Abstract

We consider a continuous-review inventory system in which the setup cost of each order is a 
general function of the order quantity and the demand process is modeled as a Brownian motion 
with a positive drift. Assuming the holding and shortage cost to be a convex function of the 
inventory level, we obtain the optimal ordering policy that minimizes the long-run average cost 
by a lower bound approach. To tackle some technical issues in the lower bound approach under 
the quantity-dependent setup cost assumption, we establish a comparison theorem that enables 
one to prove the global optimality of a policy by examining a tractable subset of admissible 
policies. Since the smooth pasting technique does not apply to our Brownian inventory model, 
we also propose a selection procedure for computing the optimal policy parameters when the 
setup cost is a step function. 


1 Introduction 


Classical inventory models usually assume a setup cost when an order is placed or a production 
run is started to replenish the inventory. It is well known that an ordering policy of the (s, S) type 
is optimal for the backlogging inventory problem when the setup cost is constant for any order or 
production quantity; see Scarf (1960), Iglehart (1963), and Veinott (1966). Arising from various
activities, order setup costs are more complex in practical inventory systems and often depend 
on order quantities. In this paper, we take quantity-dependent setup costs into consideration and 
investigate the optimal ordering policy that minimizes the long-run average cost. 

Setup costs may grow as order quantities increase. For example, if an order is shipped to the 
buyer by multiple vehicles, a shipping fee may be charged for each of them. If a vehicle’s capacity 
is Q and the shipping fee is F, the total shipping cost is a nondecreasing step function of order 
quantity given by 

$$K(\xi) = F \left\lceil \frac{\xi}{Q} \right\rceil. \tag{1.1}$$

The study of stochastic inventory models with such a setup cost can be traced back to Lippman
(1969), where the ordering cost is assumed to be a nondecreasing, subadditive function of the
order quantity. Lippman considered a periodic-review model and proved the existence of optimal 
ordering policies for both the finite-horizon problem and the discounted, infinite-horizon problem.
It is pointed out that at the beginning of each period, it is optimal to replenish the inventory
when it drops below a certain level and not to order when it is above another level. The optimal
ordering decisions, however, are not specified for inventory falling in other regions. With the setup
cost in (1.1), Iwaniec (1979) identified a set of conditions under which a full-batch-ordering policy
is optimal. Alp et al. (2014) allowed orders with partial batches in their policies and partially
characterized the optimal ordering policy that minimizes the long-run average cost. Chao and
Zipkin (2008) considered a simple quantity-dependent setup cost

$$K(\xi) = F \cdot 1_{(R,\infty)}(\xi), \tag{1.2}$$


where $1_A$ denotes the indicator function of $A \subset \mathbb{R}$. This formulation describes the administrative
cost under a supply contract with a capacity constraint: No extra cost is incurred if the
order quantity does not exceed the contract volume R; otherwise, the buyer is required to pay
an administrative fee F. The authors partially characterized the optimal ordering policy for the
periodic-review model and developed a heuristic policy. Caliskan-Demirag et al. (2012) investigated
several simple forms of nondecreasing, piecewise constant setup costs, including both (1.1)
and (1.2). They also provided partial characterizations of the optimal ordering policies.


As opposed to the increasing fee structure, setup costs may also decrease for large orders. To
achieve economies of scale in production and distribution, suppliers in e-commerce often provide
shipping discounts or free shipping for large orders. Such promotions are useful in generating
additional sales. As pointed out by Lewis et al. (2006), the shipping policies that provide incentives
for large orders may bring more profits to suppliers than standard increasing shipping fees and free
shipping promotions. Zhou et al. (2009) analyzed a periodic-review inventory model with a free
shipping option from a buyer’s perspective. The setup cost in their paper is

$$K(\xi) = F \cdot 1_{(0,R)}(\xi), \tag{1.3}$$

i.e., the supplier imposes a shipping fee F when the order quantity is less than R, but waives this 
charge if the order quantity exceeds R. They found the optimal ordering policy for the single-period 
problem and proposed a heuristic policy for the multiple-period model. 
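
The three fee structures in (1.1)-(1.3) can be compared side by side. The following Python sketch is only illustrative: the function names and the sample values of F, Q, and R used with them are assumptions made here, not part of the cited models.

```python
import math


def setup_cost_per_vehicle(xi, F, Q):
    """Cost (1.1): a fee F for each vehicle of capacity Q needed to ship xi units."""
    return F * math.ceil(xi / Q)


def setup_cost_capacity_fee(xi, F, R):
    """Cost (1.2): a fee F charged only when the order exceeds the contract volume R."""
    return F if xi > R else 0.0


def setup_cost_free_shipping(xi, F, R):
    """Cost (1.3): a fee F charged for order quantities in (0, R), waived otherwise."""
    return F if 0 < xi < R else 0.0
```

For instance, with F = 40 and Q = 100, an order of 250 units needs three vehicles, so `setup_cost_per_vehicle(250, 40, 100)` evaluates to 120. The first function is unbounded in the order quantity, whereas the other two are bounded by F.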

In practical inventory systems, order setup costs may arise from multiple activities in administration
and transportation. The costs incurred by some activities may increase with the order
quantity while others may decrease. As a result, the total setup cost of an order may not be monotone
with respect to the order quantity. The setup cost function in this paper takes a very general
form, where $K : \mathbb{R}_+ \to \mathbb{R}$ is assumed to satisfy the following conditions:

(S1) K is nonnegative with K(0) = 0;

(S2) K is bounded;

(S3) K has a right limit at zero, and if K(0+) = 0, K has a finite right derivative at zero;

(S4) K is lower semicontinuous, i.e.,

$$K(\xi) \leq \liminf_{\zeta \to \xi} K(\zeta) \quad \text{for } \xi > 0.$$


Both the setup cost in (1.2) and that in (1.3) satisfy (S1)-(S4). As a technical requirement,
condition (S4) ensures that the optimal average cost is attainable. The practical interpretation of
this condition is as follows: If the setup cost function has a jump at $\xi$, condition (S4) implies that
the buyer is allowed to pay the lower of the two fees $K(\xi-)$ and $K(\xi+)$, which essentially takes account of
the possibility that the buyer may adjust the order by a small quantity so as to pay a smaller setup fee.
Conditions (S1)-(S4) are similar to the assumptions in Perera et al. (2015), where the optimality of
(s, S) policies is proved for economic order quantity (EOQ) models under a general cost structure.

Besides the setup cost, each order incurs a proportional cost with rate k > 0. To place an order
of quantity $\xi > 0$, the manager is required to pay an ordering cost of

$$C(\xi) = K(\xi) + k\xi. \tag{1.4}$$

We do not allow multiple simultaneous orders, i.e., the ordering cost must follow (1.4) as long as
the total order quantity at an ordering time is equal to $\xi$. In (1.4), it would be more appropriate
to interpret $K(\xi)$ as the non-proportional part of the ordering cost, instead of the fixed cost in
the usual sense. Accordingly, $k\xi$ represents the proportional part, and k should be understood
as the rate of increase rather than the unit price of inventory. By decomposing the ordering cost
into proportional and non-proportional parts, this formulation allows for unbounded setup cost
functions. For example, although the setup cost in (1.1) does not satisfy (S2), we may decompose
it into

$$K(\xi) = \frac{F}{Q}\,\xi + F\left(\left\lceil \frac{\xi}{Q} \right\rceil - \frac{\xi}{Q}\right),$$

where the first and second terms are proportional and non-proportional terms, respectively. Since
the non-proportional term satisfies (S1)-(S4), we may take it as the non-proportional part of the
ordering cost and $(k + F/Q)\xi$ as the proportional part. Thanks to the general form of the
non-proportional cost, the ordering cost function given by (1.4) includes most ordering cost structures
in the literature, such as ordering costs with incremental or all-unit quantity discounts; see Porteus
(1971, 1972) and Altintas et al. (2008). For the sake of convenience, we still refer to the
non-proportional part of the ordering cost as the setup cost.
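
The decomposition of the per-vehicle fee can be checked numerically. The sketch below is a minimal illustration under assumed names (`nonproportional_part`, `ordering_cost`) and assumed sample values; it confirms that the non-proportional remainder stays within [0, F] while the total ordering cost is unchanged.

```python
import math


def nonproportional_part(xi, F, Q):
    """F * (ceil(xi / Q) - xi / Q): what is left of (1.1) after removing (F / Q) * xi."""
    return F * (math.ceil(xi / Q) - xi / Q)


def ordering_cost(xi, k, F, Q):
    """Ordering cost (1.4) for the per-vehicle fee, written through the decomposition:
    bounded non-proportional part plus proportional part with rate k + F / Q."""
    return nonproportional_part(xi, F, Q) + (k + F / Q) * xi


if __name__ == "__main__":
    for xi in (0.0, 50.0, 100.0, 250.0):
        # Columns: order quantity, bounded remainder, total ordering cost.
        print(xi, nonproportional_part(xi, 40.0, 100.0), ordering_cost(xi, 2.0, 40.0, 100.0))
```

Note that the remainder vanishes whenever the order fills its vehicles exactly, and approaches F for orders that barely spill into an additional vehicle.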

Stochastic inventory models with a general setup cost function are analytically challenging. 
Since the ordering cost function may be neither convex nor concave, it is difficult to identify the 
cost structures that can be preserved through dynamic programming. As we mentioned, the optimal 
ordering policy for the periodic-review model has not been fully characterized, even if the setup
cost function takes the simplest form as in (1.2) or (1.3). The partial characterization also suggests
that the optimal periodic-review policy would be complicated.

In this paper, we assume that the inventory is constantly monitored and an order can be 
placed at any time. To the best of our knowledge, this is the first attempt to explore optimal 
ordering policies for continuous-review inventory systems with quantity-dependent setup costs. In 
the literature on periodic-review inventory models, it is a common practice to approximate customer 
demand within each period by a normally distributed random variable; see, e.g., Chapter 1 in
Porteus (2002) and Chapter 6 in Zipkin (2000). Brownian motion is thus a reasonable model
for demand processes in continuous-time inventory systems; see, e.g., Bather (1966) and Gallego
(1990). With a Brownian demand process, the optimal ordering policy can be obtained by solving
a Brownian control problem, which turns out to be more tractable than solving a dynamic program
when the setup cost is quantity-dependent. This is because with a continuous demand process,
the manager is able to place an order at any desired inventory level. Since future demands
are independent of the history, finding the optimal ordering policy is reduced to finding constant
reorder and order-up-to levels that jointly minimize the long-run average cost. In periodic-review
models, by contrast, the manager is allowed to place an order only at the beginning of a period. As
the inventory level varies from period to period, the optimal order decision at each period depends
on both the current inventory level and the prediction of the future inventory level. The resulting
dynamic program is difficult to solve when the setup cost function takes a general form; see Chao
and Zipkin (2008) and Caliskan-Demirag et al. (2012) for more discussion.


In our model, inventory continuously incurs a holding and shortage cost that is a convex function 
of the inventory level. With the aforementioned assumptions, we prove that an (s, S) policy can 
minimize the long-run average cost, and that the optimal reorder and order-up-to levels, $(s^*, S^*)$,
can be obtained by solving a nonlinear optimization problem. When the setup cost function satisfies
K(0+) = 0, we prove that $s^* = S^* < 0$ holds under certain conditions, in which case the optimal
ordering policy becomes a base stock policy that maintains inventory above a fixed shortage level.

Brownian inventory models were first introduced by Bather (1966). In his pioneering work, 
Bather studied the impulse control of Brownian motion that allows upward adjustments. Assuming 
a constant setup cost and a convex holding and shortage cost, he obtained the (s, S) policy that 
minimizes the long-run average cost. Bather’s results have been extended to more general settings 
by a number of studies under the constant setup cost assumption. Among them, the (s, S) policy 
that minimizes the discounted cost was obtained by Sulem (1986) with a piecewise linear holding
and shortage cost, and by Benkherouf (2007) with a convex holding and shortage cost. Bar-Ilan
and Sulem (1995) obtained the optimal (s, S) policy for a Brownian inventory model that allows for
constant lead times, and Muthuraman et al. (2015) extended their results to a Brownian model with
stochastic lead times. Bensoussan et al. (2005) and Benkherouf and Bensoussan (2009) studied a
stochastic inventory model where the demand is a mixture of a Brownian motion and a compound
Poisson process; the optimal policy for this model is of the (s, S) type again. Using the fluctuation
theory of Lévy processes, Yamazaki (2013) generalized their results to spectrally positive Lévy
demand processes. In the above papers, the optimal ordering policies are obtained by solving a
set of quasi-variational inequalities (QVIs) deduced from the Bellman equation. For computing 
the optimal parameters, one needs to impose additional smoothness conditions at the reorder and 
order-up-to levels. This technique, widely known as smooth pasting, is essential to solve a Brownian 
control problem by the QVI approach. See Dixit (1993) for a comprehensive account of smooth 
pasting and its applications. 


Harrison et al. (1983) studied the impulse control of Brownian motion allowing both upward and
downward adjustments, for which a control band policy is proved optimal under the discounted cost
criterion. In that paper, the authors adopted a two-step procedure which has become a widely used 
approach to solving Brownian control problems: In the first step, one establishes a lower bound for 
the cost incurred by an arbitrary admissible policy; such a result is often referred to as a verification 
theorem. In the second step, one searches for an admissible policy to achieve this lower bound; the 
obtained policy, if any, must be optimal. The technique of smooth pasting is also a standard 
component of the lower bound approach. By imposing additional smoothness conditions at the 
boundary of a control policy, one may obtain the optimal control parameters through solving a 
free boundary problem. Following this approach, Ormeci et al. (2008) obtained the optimal control
band policy under the long-run average cost criterion. Both Harrison et al. (1983) and Ormeci
et al. (2008) assumed a constant setup cost and a piecewise linear holding and shortage cost. Their
results were extended by Dai and Yao (2013a,b), who allowed for a convex holding and shortage
cost and obtained the optimal control band policies under both the average and discounted cost
criteria. Using the lower bound approach, Harrison and Taksar (1983) and Taksar (1985) studied
the two-sided instantaneous control of Brownian motion, where double barrier policies are proved
optimal under different cost criteria. Baurdoux and Yamazaki (2015) extended the optimality of
double barrier policies to spectrally positive Lévy demand processes. The lower bound approach
was also adopted by Wu and Chao (2014), who studied optimal production policies for a Brownian 
inventory system with finite production capacity, and by Yao et al. (2015), who studied optimal 
ordering policies with a concave ordering cost. The optimal policies in these two papers are of the 
(s, S) type. 

We follow the lower bound approach in this paper, while new issues arise from our Brownian
model. We establish a verification theorem, which states that if there exist a
continuously differentiable function $f$ and a positive number $\nu$ such that they jointly satisfy some
differential inequalities, the long-run average cost under any admissible policy must be at least $\nu$.
We derive this lower bound using Itô’s formula, as in the previous studies by Harrison et al. (1983),
Ormeci et al. (2008), and Dai and Yao (2013a,b). The Brownian model in those papers allows both
upward and downward adjustments, so a control band policy is expected to be optimal. Under such
a policy, the inventory level is confined within a finite interval and the associated relative value
function is Lipschitz continuous. This fact allows them to assume $f$ to be Lipschitz continuous in
the verification theorems. With this assumption, one can prove the lower bound by relying solely
on Itô’s formula. In our model, however, only upward adjustments are allowed. The optimal policy
is expected to be an (s, S) policy whose relative value function is not Lipschitz continuous. Without
the Lipschitz assumption, it is difficult to prove the verification theorem in a direct manner. This
problem was also encountered by Wu and Chao (2014) and Yao et al. (2015). In their papers, the
lower bound results are established for a subset of admissible policies rather than for all of them; accordingly,
the proposed (s, S) policies are proved optimal within the same subset of policies.

We prove a comparison theorem to tackle this issue. This theorem states that for
any admissible policy, we can always find an admissible policy that has a finite order-up-to bound
and whose long-run average cost is either less than or arbitrarily close to the average cost incurred
by the given policy. In other words, if an ordering policy can be proved optimal within the set of
policies having order-up-to bounds, it must be optimal among all admissible policies. This result
allows us to prove the verification theorem by examining an arbitrary admissible policy that is
subject to a finite order-up-to bound. With an order-up-to bound, we no longer require $f$ to be
Lipschitz continuous for establishing the verification theorem by Itô’s formula.

For an (s, S) policy, the associated relative value function and the resulting long-run average cost
jointly satisfy a second-order ordinary differential equation along with some boundary conditions;
the solution to this equation is given in a later proposition. We use this relationship to compute the optimal
reorder and order-up-to levels. In the literature, the optimal (s, S) policies for Brownian models
with a constant setup cost are obtained by imposing smooth pasting conditions on the ordinary
differential equations; see Bather (1966), Sulem (1986), Bar-Ilan and Sulem (1995), and Wu and
Chao (2014). Unfortunately, our Brownian model does not preserve this property because the
general setup cost function imposes a quantity constraint on each setup cost value. With
these constraints, the smooth pasting conditions may no longer hold at the optimal reorder and
order-up-to levels. Without definite boundary conditions, we can neither define a free boundary
problem nor solve the QVI problem for the optimal (s, S) policy. To obtain the optimal ordering
policy, we need to minimize the long-run average cost by solving a nonlinear optimization problem.
When the setup cost is a step function, we develop a policy selection algorithm for computing the
optimal policy parameters.

The contribution of this paper is twofold. First, by assuming a Brownian demand process,
we obtain the optimal ordering policy for inventory systems with quantity-dependent setup costs,
filling a long-standing research gap. The optimality of (s, S) policies is extended to a significantly
more general cost structure. Although the optimal policy is obtained using a continuous-review
model, it will shed light on periodic-review models, presumably serving as a simple and
near-optimal solution. Second, the comparison theorem and the policy selection algorithm complement
the well-established lower bound approach to solving Brownian control problems. The comparison
theorem in this paper enables one to prove the optimality of a policy by examining a tractable subset, instead of all
admissible policies. The constructive proof of this theorem can be extended to similar comparison
results with minor modification. Using modified comparison theorems, we expect that both the
production policy proposed by Wu and Chao (2014) and the ordering policy proposed by Yao et al.
(2015) can be proved globally optimal. Besides inventory control, our approach may also be used
for solving Brownian control problems arising from financial management (see, e.g.,
Constantinides 1976 and Paulsen 2008), production systems (Wein 1992, Veatch and Wein 1996,
and Ata et al. 2005), and queueing control (Ata 2006 and Rubino and Ata 2009).


The rest of this paper is organized as follows. We introduce the Brownian inventory model in
§2 and present the main results in §3. A lower bound is then derived for the long-run average
cost under an arbitrary admissible policy, followed by an analysis of the relative value function and
the average cost under an (s, S) policy. Using these results, we prove the optimality of the proposed
policy. A separate section is dedicated to the proof of the comparison theorem, which enables us to
investigate an optimal policy within a subset of admissible policies. We then introduce a policy
selection algorithm for obtaining the optimal ordering policy when the setup cost is a step function.
The paper closes with concluding remarks, and we leave the proofs of technical lemmas to the appendix.

Let us close this section with frequently used notation. Let $\varphi$ be a real-valued function defined
on $\mathbb{R}$. We use $\Delta\varphi(t)$ to denote its increment at $t$, i.e., $\Delta\varphi(t) = \varphi(t+) - \varphi(t-)$ if the one-sided
limits exist. We use $\varphi'(t)$ and $\varphi''(t)$ to denote its first and second derivatives at $t$, respectively.


2 Brownian inventory model 

Consider a continuous-time inventory system whose inventory level at time $t \geq 0$ is denoted by
$Z(t)$. We allow $Z(t)$ to be less than zero, in which case $|Z(t)|$ is interpreted as the back order or
shortage level. We assume that all unsatisfied demands will be back-ordered and that the lead time
of each order is zero. Let $D(t)$ and $Y(t)$ be the cumulative demand quantity and the cumulative
order quantity during $[0, t]$, respectively. The inventory level at time $t \geq 0$ is given by

$$Z(t) = x - D(t) + Y(t),$$

where $x$ is a real number. We refer to $Z = \{Z(t) : t \geq 0\}$ as the inventory process, and put
$Z(0-) = x$, which is interpreted as the initial inventory level. We assume that the cumulative
demand process $D = \{D(t) : t \geq 0\}$ is a Brownian motion that starts from $D(0) = 0$ and has drift
$\mu > 0$ and variance $\sigma^2 > 0$. In other words, $D$ has the representation

$$D(t) = \mu t - \sigma B(t),$$

where $B = \{B(t) : t \geq 0\}$ is a standard Brownian motion defined on a filtered probability space
$(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$ with filtration $\mathbb{F} = \{\mathcal{F}(t) : t \geq 0\}$. Then, the inventory level at time $t$ can be written as

$$Z(t) = X(t) + Y(t), \tag{2.1}$$

where

$$X(t) = x - \mu t + \sigma B(t). \tag{2.2}$$


We refer to $X = \{X(t) : t \geq 0\}$ as the uncontrolled inventory process. The system manager
replenishes the inventory according to a non-anticipating ordering policy, which is specified by the
cumulative order process $Y = \{Y(t) : t \geq 0\}$. More specifically, an ordering policy is said to be
admissible if $Y$ satisfies the following three conditions. First, for each sample path $\omega \in \Omega$, $Y(\omega, \cdot)$
is a nondecreasing function that is right-continuous on $[0, \infty)$ and has left limits on $(0, \infty)$. Second,
$Y(t) \geq 0$ for all $t \geq 0$. Third, $Y$ is adapted to $\mathbb{F}$, i.e., $Y(t)$ is $\mathcal{F}(t)$-measurable for all $t \geq 0$. We use
$\mathcal{U}$ to denote the set of all admissible ordering policies, or equivalently, the set of all cumulative order
processes that satisfy the above three conditions. With the convention $Y(0-) = 0$, an admissible
policy $Y$ is said to increase at time $t \geq 0$ if $Y(u) - Y(t-) > 0$ for all $u > t$. We call $t$ an ordering
time if $Y$ increases at $t$. Let $I(t)$ be the cardinality of the set $\{u \in [0, t] : Y \text{ increases at } u\}$, which
is interpreted as the number of orders placed by time $t$. Moreover, $t$ is said to be a jump time if
$\Delta Y(t) > 0$. Let

$$Y^c(t) = Y(t) - \sum_{0 \leq u \leq t} \Delta Y(u). \tag{2.3}$$

Then, $Y^c = \{Y^c(t) : t \geq 0\}$ is the continuous part of $Y$.


Each order incurs an ordering cost given by (1.4), with k > 0 and K satisfying (S1)-(S4). If
$K(0+) > 0$, we only need to consider the policies that place finitely many orders over a finite time
interval, i.e., $I(t) < \infty$ almost surely for $t > 0$, because otherwise, either the cumulative ordering
cost or the cumulative holding and shortage cost will be infinite by time $t$. In other words, when
$K(0+) > 0$, we consider $Y$ that is a piecewise constant function on almost all sample paths, which
implies that $Y^c(t) = 0$ for all $t \geq 0$ almost surely. Therefore, the cumulative ordering cost during
$[0, t]$ is given by

$$C_Y(t) = \sum_{0 \leq u \leq t} K(\Delta Y(u)) + kY(t). \tag{2.4}$$


When $K(0+) = 0$, the manager may also exert inventory control through the continuous part of
$Y$, which may no longer be a zero process. To analyze the setup cost incurred by $Y^c$, let us put

$$\ell = \lim_{\xi \downarrow 0} \frac{K(\xi)}{\xi}. \tag{2.5}$$

By (S3), $\ell$ is the right derivative of K at zero if K(0+) = 0. We would thus interpret $\ell$ as the unit
setup cost when the order quantity is infinitesimal. Besides a proportional cost of k, every unit
increment of $Y^c$ incurs a setup cost of $\ell$. Hence, the cumulative ordering cost during $[0, t]$ is

$$C_Y(t) = \sum_{0 \leq u \leq t} K(\Delta Y(u)) + kY(t) + \ell Y^c(t). \tag{2.6}$$

Note that $\ell = \infty$ if $K(0+) > 0$. Following the convention that $0 \cdot \infty = 0$, we may take (2.4) as a
special case of the cumulative ordering cost given by (2.6).
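
The cumulative ordering cost in (2.6), including the convention for an infinite unit setup cost, can be evaluated as in the sketch below. The function name and its inputs (a list of jump sizes of Y up to time t and the value of the continuous part at t) are assumptions chosen for illustration. The convention has to be coded explicitly because floating-point arithmetic does not treat zero times infinity as zero.

```python
import math


def cumulative_ordering_cost(jumps, continuous_part, k, K, ell):
    """Evaluate (2.6) at a fixed time t.

    jumps           -- sizes of the jumps of Y on [0, t] (lump-sum orders)
    continuous_part -- value of the continuous part of Y at time t
    k               -- proportional cost rate
    K               -- setup cost function
    ell             -- unit setup cost for infinitesimal orders (may be math.inf)
    """
    total_ordered = sum(jumps) + continuous_part   # Y(t), using Y(0-) = 0
    setup = sum(K(q) for q in jumps)               # setup cost of lump-sum orders
    # Convention 0 * inf = 0: a zero continuous part costs nothing even if ell is infinite.
    continuous_setup = 0.0 if continuous_part == 0 else ell * continuous_part
    return setup + k * total_ordered + continuous_setup
```

With a constant setup cost and `ell = math.inf`, the function reduces to (2.4) for piecewise constant policies and returns infinity as soon as the continuous part is positive.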


In addition to the ordering cost, the system incurs a holding and shortage cost that is charged
at rate $h(z)$ when the inventory level is $z$. More specifically, $h(z)$ is the inventory holding cost per
unit of time for $z \geq 0$, and is the shortage cost of back orders per unit of time for $z < 0$. The
cumulative holding and shortage cost by time $t$ is thus given by

$$\mathcal{H}_Y(t) = \int_0^t h(Z(u))\,du, \tag{2.7}$$


which depends on the ordering policy through the inventory process. We assume that h satisfies 
the following conditions: 


(H1) $h(0) = 0$;

(H2) $h$ is a convex function;

(H3) $h$ is continuously differentiable except at $z = 0$;

(H4) $h'(z) > 0$ for $z > 0$ and $h'(z) < 0$ for $z < 0$;

(H5) $h$ is polynomially bounded, i.e., there exist a positive integer $a$ and two positive numbers $b_0$
and $b_1$ such that $h(z) \leq b_0 + b_1 |z|^a$ for all $z \in \mathbb{R}$.

In particular, with $\beta_1, \beta_2, \beta > 0$, both the piecewise linear cost

$$h(z) = \begin{cases} \beta_1 z & \text{for } z \geq 0, \\ -\beta_2 z & \text{for } z < 0 \end{cases}$$

and the quadratic cost $h(z) = \beta z^2$ satisfy (H1)-(H5).

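Given the initial inventory level $x$ and the ordering policy $Y \in \mathcal{U}$, the long-run average cost is
defined as

$$\mathrm{AC}(x, Y) = \limsup_{t \to \infty} \frac{1}{t}\, \mathbb{E}_x\big[C_Y(t) + \mathcal{H}_Y(t)\big],$$

where $\mathbb{E}_x$ is the expectation conditioning on the initial inventory level $Z(0-) = x$. By (2.6)-(2.7),
the long-run average cost is given by

$$\mathrm{AC}(x, Y) = \limsup_{t \to \infty} \frac{1}{t}\, \mathbb{E}_x\bigg[\int_0^t h(Z(u))\,du + \sum_{0 \leq u \leq t} K(\Delta Y(u)) + kY(t) + \ell Y^c(t)\bigg]. \tag{2.8}$$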


When $K(0+) > 0$, we only need to consider ordering policies having piecewise constant sample
paths. Such a policy can be specified by a sequence of pairs $\{(\tau_j, \xi_j) : j = 0, 1, \ldots\}$ where $\tau_j$ is the
$j$th ordering time and $\xi_j$ is the quantity of the $j$th order. By convention, we set $\tau_0 = 0$ and let $\xi_0$
be the quantity of the order placed at time zero ($\xi_0 = 0$ if no order is placed). With this sequence,
the ordering policy $Y$ can be specified by $Y(t) = \sum_{j=0}^{J(t)} \xi_j$ where $J(t) = \max\{j \geq 0 : \tau_j \leq t\}$. On
the other hand, if the ordering policy $Y$ is given, we can obtain each ordering time by

$$\tau_j = \inf\{t > \tau_{j-1} : Y(t) > Y(t-)\} \quad \text{for } j = 1, 2, \ldots$$

and each order quantity by

$$\xi_j = Y(\tau_j) - Y(\tau_j-) \quad \text{for } j = 0, 1, \ldots.$$

Therefore, finding an optimal ordering policy when $K(0+) > 0$ is equivalent to specifying a sequence
of optimal ordering times and order quantities $\{(\tau_j, \xi_j) : j = 0, 1, \ldots\}$, which turns out to be an
impulse control problem for the Brownian model. For the ordering policy $Y$ to be adapted to $\mathbb{F}$,
we require each $\tau_j$ to be an $\mathbb{F}$-stopping time and each $\xi_j$ to be $\mathcal{F}(\tau_j)$-measurable.

When the setup cost has $K(0+) = 0$, the manager may adjust the inventory level using the
continuous part of $Y$ without incurring infinite costs. If $Y^c$ is not a zero process, we will have
$I(t) = \infty$ for some $t > 0$ with a positive probability. It may happen that the optimal ordering
policy has continuous sample paths except for a possible jump at time zero. In this case, the 
ordering problem becomes an instantaneous control problem for the Brownian model. 


3 Main results 


The main results of this paper are presented in this section. The main theorem states that with a setup
cost that satisfies (S1)-(S4) and a holding and shortage cost that satisfies (H1)-(H5), the optimal
ordering policy for the Brownian inventory model is an (s, S) policy with $s \leq S$. In addition, the
optimal reorder and order-up-to levels $(s^*, S^*)$ satisfy $s^* < S^*$ if $K(0+) > 0$, and satisfy $s^* \leq S^*$
if $K(0+) = 0$. As a special case, the optimal ordering policy becomes a base stock policy when
$s^* = S^*$. We also provide a comparison theorem, which is a technical tool for proving
the main theorem by the lower bound approach.

Under an (s, S) policy, whenever the inventory level drops to $s$ or below, the manager places an
order that replenishes the inventory to level $S$ immediately. We use $U(s, S)$ to denote this policy.
Clearly, $U(s, S) \in \mathcal{U}$ for $s \leq S$. An (s, S) policy with $s < S$ can be specified by the sequence of
pairs $\{(\tau_j, \xi_j) : j = 0, 1, \ldots\}$ as follows. With $\tau_0 = 0$ and

$$\xi_0 = \begin{cases} S - x & \text{if } x \leq s, \\ 0 & \text{if } x > s, \end{cases}$$

the $j$th order is placed at time $\tau_j = \inf\{t > \tau_{j-1} : Z(t-) \leq s\}$ with a constant quantity $\xi_j = S - s$.
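
To visualize the policy, one can simulate the controlled inventory process on a time grid. The sketch below is a rough illustration and not part of the model: the Euler discretization with step `dt`, the function name, and the parameter values in the test are all assumptions. In discrete time the inventory can overshoot below s between grid points, so the simulated order quantities S - Z are slightly larger than S - s, whereas the continuous-time policy always orders exactly S - s after time zero.

```python
import numpy as np


def simulate_sS(x, s, S, mu, sigma, horizon, dt, seed=0):
    """Simulate an (s, S) policy for Brownian demand on a grid of step dt.

    Returns the inventory path (after ordering) and a list of (time, quantity) orders.
    """
    rng = np.random.default_rng(seed)
    n = int(round(horizon / dt))
    path = np.empty(n + 1)
    orders = []

    level = x
    if level <= s:                       # possible order at time zero
        orders.append((0.0, S - level))
        level = S
    path[0] = level

    for i in range(1, n + 1):
        # Increment of the uncontrolled process X in (2.2): drift -mu, volatility sigma.
        level += -mu * dt + sigma * np.sqrt(dt) * rng.standard_normal()
        if level <= s:                   # reorder level reached: order up to S
            orders.append((i * dt, S - level))
            level = S
        path[i] = level
    return path, orders
```

Dividing the number of simulated orders by the horizon gives a rough estimate of the ordering frequency, which in the continuous-time model equals the demand rate divided by S - s.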

If the reorder and order-up-to levels are equal, the (s, S) policy becomes a base stock policy. 
Under the base stock policy, if the initial inventory level x is below the base stock level s, the 
manager places an order of quantity $s - x$ at time zero that replenishes the inventory to $Z(0) = s$;
otherwise, the manager does not order at time zero. After that, whenever the inventory level drops
below s, the manager brings it back to s immediately. Such a policy is well defined for our Brownian
model, and the inventory process under it has an analytic expression; see the lemma below.


Lemma 1. Let $s$ be a real number and $C[0, \infty)$ the set of real-valued continuous functions on $[0, \infty)$.
Then for each $\phi \in C[0, \infty)$, there exists a unique pair of functions $(\eta, \zeta) \in C[0, \infty) \times C[0, \infty)$ such
that (i) $\eta$ is nondecreasing with $\eta(0) = (s - \phi(0))^+$; (ii) $\zeta(t) = \phi(t) + \eta(t) \geq s$ for $t \geq 0$; (iii) $\eta$
increases only when $\zeta(t) = s$, i.e.,

$$\int_0^\infty (\zeta(t) - s)\, d\eta(t) = 0.$$

Specifically,

$$\eta(t) = \sup_{0 \leq u \leq t} (s - \phi(u))^+ \quad \text{for } t \geq 0.$$

This lemma is a modified version of Proposition 2.1 in Harrison (2013). The proof is similar and
thus omitted. Under the base stock policy, the inventory process in our Brownian model becomes
a reflected Brownian motion with lower reflecting barrier at $s$. By Lemma 1, the cumulative order
quantity during $[0, t]$ is

$$Y(t) = \sup_{0 \leq u \leq t} (s - X(u))^+,$$

where $X(u)$ is given by (2.2). Clearly, $Y$ is admissible for each $s \in \mathbb{R}$. Because $Y$ has continuous
sample paths, for each $t > 0$, there are infinitely many ordering times in $[0, t]$ with a positive probability.
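
The explicit formula in Lemma 1 has a direct discrete analogue: on a sampled path, the running supremum becomes a cumulative maximum. The sketch below assumes a path sampled at finitely many time points and uses a hypothetical function name; it is meant only to illustrate the three properties in the lemma.

```python
import numpy as np


def reflect_at(phi, s):
    """Discrete version of the map in Lemma 1 for a sampled path phi.

    Returns (eta, zeta), where eta is the running maximum of (s - phi)^+
    and zeta = phi + eta is the path reflected at the lower barrier s.
    """
    phi = np.asarray(phi, dtype=float)
    eta = np.maximum.accumulate(np.maximum(s - phi, 0.0))
    zeta = phi + eta
    return eta, zeta
```

On a sampled path, `eta` is nondecreasing, `zeta` never falls below `s`, and `eta` grows only at sample points where `zeta` sits at the barrier, mirroring properties (i)-(iii).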

Before stating the main theorem, let us introduce a proposition that characterizes the optimal
policy of the (s, S) type. In particular, the long-run average cost under an (s, S) policy has an
analytic expression, which is given by (3.1) below and will be proved later in the paper.


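Proposition 1. Assume that the setup cost K satisfies (S1)-(S4) and that the holding and shortage
cost h satisfies (H1)-(H5). Let

$$\gamma(s, S) = \begin{cases} k\mu + \dfrac{K(S - s)\mu}{S - s} + \dfrac{\lambda}{S - s} \displaystyle\int_s^S \int_0^\infty h(u + y)\, e^{-\lambda u}\,du\,dy & \text{if } s < S, \\[2ex] (k + \ell)\mu + \lambda \displaystyle\int_0^\infty h(u + s)\, e^{-\lambda u}\,du & \text{if } s = S, \end{cases} \tag{3.1}$$

where $\lambda = 2\mu/\sigma^2$. Then, there exists $(s^*, S^*) \in \mathbb{R}^2$ such that

$$\gamma(s^*, S^*) = \inf\{\gamma(s, S) : s \leq S\}. \tag{3.2}$$

If $K(0+) > 0$, the minimizer $(s^*, S^*)$ satisfies $s^* < z^* < S^*$, where $z^*$ is the unique solution to

$$\lambda \int_0^\infty h(u + z^*)\, e^{-\lambda u}\,du = h(z^*) \tag{3.3}$$

and satisfies $z^* < 0$; if $K(0+) = 0$, the minimizer satisfies either $s^* < z^* < S^*$ or $s^* = z^* = S^*$.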
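
For a concrete feel of (3.1)-(3.3), the integrals can be evaluated in closed form when the holding and shortage cost is piecewise linear. The sketch below rests on several assumptions that are not from the paper: it fixes the piecewise linear cost with slopes `beta1` (holding) and `beta2` (shortage), uses closed-form expressions derived here for that special case, and replaces the nonlinear optimization by a brute-force grid search. All function names are hypothetical.

```python
import math


def g(y, lam, beta1, beta2):
    """lam * int_0^inf h(u + y) exp(-lam u) du for h(z) = beta1*z (z >= 0), -beta2*z (z < 0)."""
    if y >= 0:
        return beta1 * (y + 1.0 / lam)
    return -beta2 * (y + 1.0 / lam) + (beta1 + beta2) * math.exp(lam * y) / lam


def G(y, lam, beta1, beta2):
    """An antiderivative of g, used for the outer integral over [s, S] in (3.1)."""
    c = (beta1 + beta2) / lam**2
    if y >= 0:
        return beta1 * (y * y / 2.0 + y / lam) + c
    return -beta2 * (y * y / 2.0 + y / lam) + c * math.exp(lam * y)


def gamma(s, S, k, K, ell, mu, sigma, beta1, beta2):
    """Long-run average cost (3.1) of the (s, S) policy for s <= S."""
    lam = 2.0 * mu / sigma**2
    if s == S:  # base stock policy
        return (k + ell) * mu + g(s, lam, beta1, beta2)
    q = S - s
    return k * mu + K(q) * mu / q + (G(S, lam, beta1, beta2) - G(s, lam, beta1, beta2)) / q


def z_star(mu, sigma, beta1, beta2):
    """Solution of (3.3) for the piecewise linear cost: exp(lam * z) = beta2 / (beta1 + beta2)."""
    lam = 2.0 * mu / sigma**2
    return math.log(beta2 / (beta1 + beta2)) / lam


def grid_search(cost, s_values, S_values):
    """Brute-force minimization of cost(s, S) over grid points with s < S."""
    best = (math.inf, None, None)
    for s in s_values:
        for S in S_values:
            if s < S:
                value = cost(s, S)
                if value < best[0]:
                    best = (value, s, S)
    return best
```

With a constant setup cost, the grid minimizer should bracket the value returned by `z_star`, in line with the ordering stated in the proposition. A grid search is only a crude stand-in for a proper optimization routine and is used here to keep the example dependency-free.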


Remark 3.1. The long-run average cost under the (s, S) policy is given by $\gamma(s, S)$ in (3.1). When
$s < S$, the first expression in (3.1) can be interpreted as follows. Since the quantity of each order is
$S - s$, the long-run average proportional and setup costs are $k\mu$ and $K(S - s)\mu/(S - s)$, respectively.
The inventory process under the (s, S) policy is regenerative. Within each cycle, the trajectory of
$Z$ is identical to that of a Brownian motion starting from $S$ with drift $-\mu$ and variance $\sigma^2$, so a
cycle length has the same distribution as the first hitting time of $s$ by $X$ in (2.2) with $X(0) = S$.
More specifically, assuming $X(0) = x$, let us put

$$T(y) = \inf\{t \geq 0 : X(t) = y\} \quad \text{and} \quad H_x(y) = \mathbb{E}_x\bigg[\int_0^{T(y)} h(X(u))\,du\bigg]. \tag{3.4}$$

Then with $X(0) = S$, the length of a cycle can be represented by $T(s)$ and the expected holding
and shortage cost during a cycle is $H_S(s)$. The long-run average holding and shortage cost is thus
equal to $H_S(s)/\mathbb{E}_S[T(s)]$ (see, e.g., Theorem VI.3.1 in Asmussen 2003). By Theorem 5.32 in Serfozo
(2009), $\mathbb{E}_S[T(s)] = (S - s)/\mu$. The formula of $H_x(y)$ can be found in §15.3 in Karlin and Taylor
(1981), where

$$H_S(s) = \frac{\lambda}{\mu} \int_s^S \int_0^\infty h(u + y)\, e^{-\lambda u}\,du\,dy.$$


10 































Hence, the third term on the right side is the long-run average holding and shortage cost. When 
K{0+) = 0 in (S3), by taking S' — s —)• 0, the long-run average cost of the (s, S) policy converges to 


'y{s, s) = {k + + X I h{u + s)e du 


Since the (s, S) policy turns out to be a base stock policy when s = S, the second expression in 


(3.1) is the long-run average cost under the base stock policy with base stock level s. 
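The cost formula (3.1) can be evaluated numerically. The following is a minimal sketch, assuming a piecewise-linear cost $h$ (holding rate 1, shortage rate 4), an illustrative setup cost $K(\xi) = \min(2\xi, 10)$ with $K(0+) = 0$, and the reading of $\ell$ as the limit of $K(\xi)/\xi$ as $\xi \downarrow 0$ (so $\ell = 2$ here). None of these choices come from the paper, and the midpoint-rule quadrature is only one of many possible discretizations.

```python
import math


def inner_cost(h, y, lam, n=4000, tail=40.0):
    """Midpoint rule for lam * int_0^inf h(u + y) * exp(-lam * u) du (truncated tail)."""
    du = tail / lam / n
    return lam * du * sum(
        h((i + 0.5) * du + y) * math.exp(-lam * (i + 0.5) * du) for i in range(n)
    )


def average_cost(s, S, K, k, ell, mu, sigma, h, n_outer=50):
    """Long-run average cost gamma(s, S) in (3.1); K maps an order quantity to its setup cost."""
    lam = 2.0 * mu / sigma ** 2
    if S == s:                             # base stock policy, second case of (3.1)
        return (k + ell) * mu + inner_cost(h, s, lam)
    xi = S - s
    dy = xi / n_outer
    # (lam / xi) * double integral, with the outer integral by the midpoint rule
    holding = sum(inner_cost(h, s + (j + 0.5) * dy, lam) for j in range(n_outer)) / n_outer
    return k * mu + K(xi) * mu / xi + holding


# Hypothetical problem data (assumptions for illustration only).
h = lambda z: 1.0 * z if z >= 0.0 else -4.0 * z
K = lambda xi: min(2.0 * xi, 10.0)
params = dict(K=K, k=1.0, ell=2.0, mu=1.0, sigma=2.0, h=h)

gamma_sS = average_cost(-1.5, 3.5, **params)
gamma_near = average_cost(-0.4, -0.4 + 1e-3, **params)   # S - s close to zero
gamma_base = average_cost(-0.4, -0.4, **params)          # base stock policy
```

With these illustrative data, `gamma_near` and `gamma_base` should nearly coincide, which mirrors the convergence of the first expression in (3.1) to the second as $S - s \to 0$ when $K(0+) = 0$.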


Remark 3.2. The pair $(s^*, S^*)$ that satisfies (3.2) specifies the reorder and order-up-to levels (which
may not be unique) for the optimal $(s, S)$ policy. When $K(0+) = 0$, the optimal $(s, S)$ policy may
be a base stock policy whose base stock level $z^*$ is specified by (3.3). Since $z^* < 0$, the inventory
under the optimal base stock policy is maintained above a fixed shortage level. Regulated by the
(slightly) negative base stock level, the inventory will fluctuate in a neighborhood of zero. By (3.1)
and (3.3), the minimum long-run average cost is equal to

\[ \gamma(z^*,z^*) = (k+\ell)\mu + h(z^*). \]

The optimal base stock level can be interpreted as follows. As discussed earlier, the long-run average
ordering cost must be $(k+\ell)\mu$ under any base stock policy. The optimal base stock policy should
thus minimize the average holding and shortage cost. As a reflected Brownian motion with a
negative drift, $Z$ will reach a steady state as time goes by. Let $Z(\infty)$ be the steady-state inventory
level. If the base stock level is $s$, $Z(\infty) - s$ follows an exponential distribution with rate $\lambda = 2\mu/\sigma^2$
(see, e.g., Proposition 6.6 in Harrison 2013). The resulting long-run average holding and shortage
cost is given by

\[ H(s) = \int_0^\infty h(u+s)\cdot\lambda e^{-\lambda u}\,du. \]

Integration by parts gives $H'(s) = \lambda\,(H(s) - h(s))$, so by setting the first derivative of $H$ equal to
zero, the optimal base stock level can be obtained by solving (3.3), from which we also have
$H(z^*) = h(z^*)$. Therefore, $(k+\ell)\mu + h(z^*)$ is the long-run average cost under the optimal base
stock policy.
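Equation (3.3) can be solved numerically. The following is a minimal sketch, assuming a piecewise-linear cost $h$ with hypothetical holding rate `c_hold` and shortage rate `c_short`; for that particular $h$, a direct calculation gives the closed form $z^* = -\lambda^{-1}\ln(1 + c_{\text{hold}}/c_{\text{short}})$, which is used here only as a cross-check. The bisection relies on $H(z) - h(z)$ being negative below $z^*$ and positive above it, which holds when $H$ is convex.

```python
import math


def steady_state_cost(h, s, lam, n=20000, tail=40.0):
    """Midpoint rule for H(s) = int_0^inf h(u + s) * lam * exp(-lam * u) du."""
    upper = tail / lam                     # exp(-lam * upper) = e^{-40}, negligible
    du = upper / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        total += h(u + s) * math.exp(-lam * u)
    return lam * total * du


def solve_base_stock_level(h, lam, lo, hi, tol=1e-7):
    """Bisection on H(z) - h(z), assumed negative at lo and positive at hi."""
    f = lambda z: steady_state_cost(h, z, lam) - h(z)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical problem data (assumptions for illustration only).
mu, sigma = 1.0, 2.0
lam = 2.0 * mu / sigma ** 2
c_hold, c_short = 1.0, 4.0
h = lambda z: c_hold * z if z >= 0.0 else -c_short * z

z_star = solve_base_stock_level(h, lam, lo=-20.0, hi=0.0)
z_closed = -math.log(1.0 + c_hold / c_short) / lam   # closed form for this h only
```

For other convex cost functions only `h` and the bracketing interval need to change; the closed-form comparison is specific to the piecewise-linear case.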

Remark 3.3. Although the optimal base stock policy incurs less holding and shortage cost than
any $(s, S)$ policy with $s < S$, there may exist some $s < S$ such that the $(s, S)$ policy with these
parameters incurs less setup cost, i.e., $K(S-s)/(S-s) < \ell$. When $K(0+) = 0$, the optimal reorder
and order-up-to levels may either satisfy $s^* < z^* < S^*$ or satisfy $s^* = z^* = S^*$.

Let

\[ \nu^* = \inf\{AC(x, Y) : x \in \mathbb{R},\ Y \in \mathcal{U}\}, \]

where $AC(x,Y)$ is the long-run average cost given by (2.8). Theorem 1 states the optimality of
$(s, S)$ policies among all admissible policies. Under the average cost criterion, neither the optimal
ordering policy nor the minimum long-run average cost depends on the initial inventory level.


Theorem 1. Assume that the setup cost $K$ satisfies (S1)-(S4) and that the holding and shortage
cost $h$ satisfies (H1)-(H5). Then, with $(s^*,S^*)$ determined by (3.2), $U(s^*,S^*)$ is an optimal ordering
policy that minimizes the long-run average cost, i.e., $\nu^* = \gamma(s^*,S^*)$ with $\gamma$ given by (3.1).


The second theorem is a critical result for proving Theorem 1 by the lower bound approach, playing
an important role in establishing the verification theorem (see Proposition 2 in Section 4). It implies
that an ordering policy that is optimal within the set of policies having order-up-to bounds must
be optimal in all admissible policies. Since policies subject to order-up-to bounds are analytically
tractable, it is more convenient to prove the optimality within these policies.

For $m = 1, 2, \ldots$, let

\[ \mathcal{U}_m = \{Y \in \mathcal{U} : Z(t) \le m \text{ for all ordering times } t\}, \]

which is the set of admissible policies with an order-up-to bound at $m$. Clearly, $U(s,S) \in \mathcal{U}_m$ if
$s \le S \le m$. Because of the Brownian demand process, it is possible that $Z(t) > m$ under a policy
in $\mathcal{U}_m$ if $t$ is not an ordering time.


Theorem 2. Assume that the setup cost $K$ satisfies (S1)-(S3) and that the holding and shortage
cost $h$ is nondecreasing on $[0, \infty)$. Then, for any admissible policy $Y$, there exists a sequence of
admissible policies $\{Y_m \in \mathcal{U}_m : m = 1, 2, \ldots\}$ such that

\[ \lim_{m\to\infty} AC(x, Y_m) \le AC(x, Y). \tag{3.5} \]

Let $\bar{\mathcal{U}} = \bigcup_{m=1}^{\infty} \mathcal{U}_m$ be the set of admissible policies subject to order-up-to bounds. Theorem 2
implies that a policy that is optimal in $\bar{\mathcal{U}}$ must be optimal in $\mathcal{U}$. We will prove this theorem in Section 7.


4 A lower bound for long-run average costs 

In this section, we establish a lower bound for the long-run average cost under an arbitrary
admissible policy. This lower bound is specified by differential inequalities with respect to a relative
value function. In the lower bound approach, such a result is referred to as a verification theorem.


Proposition 2. Assume that $K$ satisfies (S1)-(S3) and that $h$ satisfies (H1)-(H5). Let $f : \mathbb{R} \to \mathbb{R}$
be a continuously differentiable function with $f'$ absolutely continuous. Assume that

\[ f(z_1) - f(z_2) \ge -K(z_1 - z_2) - k(z_1 - z_2) \quad \text{for } z_1 > z_2, \tag{4.1} \]

and that there exist a positive integer $d$ and two positive numbers $a_0$ and $a_1$ such that

\[ |f'(z)| \le a_0 \quad \text{for } z < 0 \tag{4.2} \]

and

\[ |f'(z)| \le a_0 + a_1 z^d \quad \text{for } z \ge 0. \tag{4.3} \]

Let $\Gamma$ be the generator of $X$ in (2.2), i.e.,

\[ \Gamma f(z) = \tfrac{1}{2}\sigma^2 f''(z) - \mu f'(z). \]

Assume that there exists a positive number $\nu$ that satisfies

\[ \Gamma f(z) + h(z) \ge \nu \quad \text{for } z \in \mathbb{R} \text{ such that } f''(z) \text{ exists}. \tag{4.4} \]

Then, $AC(x,Y) \ge \nu$ for $x \in \mathbb{R}$ and $Y \in \mathcal{U}$, where $AC(x,Y)$ is given by (2.8).

If we can find an ordering policy whose relative value function satisfies all the assumptions on $f$
and whose long-run average cost $\nu$ satisfies (4.4), then by Proposition 2, this policy must be optimal
in all admissible policies. To prove Proposition 2, we need two technical lemmas about inventory
processes subject to an order-up-to bound.

For a given positive integer $m$, let

\[ Y^m(t) = \sup_{0 \le u \le t} (m - X(u))^{+} \quad\text{and}\quad Z^m(t) = X(t) + Y^m(t). \tag{4.5} \]

By Lemma 1, $Y^m = \{Y^m(t) : t \ge 0\}$ is the base stock policy with base stock level $m$, under which
$Z^m = \{Z^m(t) : t \ge 0\}$ is a reflected Brownian motion starting from $Z^m(0) = x \vee m$ with lower
reflecting barrier at $m$. The next lemma states that $Z^m$ dominates all inventory processes that
have an order-up-to bound at $m$.


Lemma 2. For a positive integer $m$, let $Z$ be the inventory process given by (2.1) with $Y \in \mathcal{U}_m$
and $Z^m$ the inventory process given by (4.5) with $X$ defined by (2.2). Then, $Z(t) \le Z^m(t)$ on each
sample path for all $t \ge 0$.


The marginal distribution of $Z^m$ can be specified as follows. Let

\[ \psi_x^m(v,t) = \mathbb{P}[Z^m(t) > v \mid X(0) = x] \quad \text{for } t > 0 \text{ and } v \ge 0. \]

Then by (3.63) in Harrison (2013), $\psi_x^m(v,t) = 1$ for $0 \le v < m$ and

\[ \psi_x^m(v,t) = \Phi\Big(\frac{-v + (x \vee m) - \mu t}{\sigma\sqrt{t}}\Big) + e^{-\lambda(v-m)}\,\Phi\Big(\frac{-v - (x \vee m) + 2m + \mu t}{\sigma\sqrt{t}}\Big) \quad \text{for } v \ge m, \tag{4.6} \]

where $\Phi$ is the standard Gaussian cumulative distribution function. Because $Z^m$ dominates all
inventory processes that have an order-up-to bound at $m$, we may use its marginal distribution to
establish boundedness results for policies in $\bar{\mathcal{U}}$.


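Lemma 3. Let $f : \mathbb{R} \to \mathbb{R}$ be a differentiable function and $Z$ the inventory process given by (2.1)
with $Y \in \bar{\mathcal{U}}$. Assume that there exist a positive integer $d$ and two positive numbers $a_0$ and $a_1$ such
that

\[ |f'(z)| \le a_0 + a_1 |z|^d \quad \text{for } z \in \mathbb{R}. \tag{4.7} \]

Then,

\[ \mathbb{E}_x[|f(Z(t))|] < \infty \quad \text{for } t \ge 0, \tag{4.8} \]

and

\[ \mathbb{E}_x\Big[\int_0^t f'(Z(u))^2\,du\Big] < \infty \quad \text{for } t \ge 0. \tag{4.9} \]

Moreover,

\[ \lim_{t\to\infty} \frac{1}{t}\,\mathbb{E}_x\big[|f(Z(t)) \cdot 1_{[0,\infty)}(Z(t))|\big] = 0. \tag{4.10} \]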


The proof of Proposition 2 relies on Theorem 2 and Lemma 3, which enable us to establish a
lower bound for long-run average costs by examining policies in $\bar{\mathcal{U}}$, instead of all admissible policies.

Proof of Proposition 2. By Theorem 2, it suffices to consider $Y \in \bar{\mathcal{U}}$, in which case (4.8)-(4.10)
hold. By (2.1), (2.3), and Itô's formula (see, e.g., Lemma 3.1 in Dai and Yao 2013a),

\[ f(Z(t)) = f(x) + \int_0^t \Gamma f(Z(u))\,du + \sigma\int_0^t f'(Z(u))\,dB(u) + \int_0^t f'(Z(u))\,dY^c(u) + \sum_{0 \le u \le t} \Delta f(Z(u)) \]

for $t \ge 0$. Then by (4.4),

\[ f(Z(t)) \ge f(x) + \nu t - \int_0^t h(Z(u))\,du + \sigma\int_0^t f'(Z(u))\,dB(u) + \int_0^t f'(Z(u))\,dY^c(u) + \sum_{0 \le u \le t} \Delta f(Z(u)). \tag{4.11} \]

By (4.1) and the definition of $\ell$, $f'(z) \ge -(k+\ell)$ for $z \in \mathbb{R}$, where $\ell = \infty$ if $K(0+) > 0$. Then by (4.1) and
(4.11),

\[ f(Z(t)) \ge f(x) + \nu t - \int_0^t h(Z(u))\,du + \sigma\int_0^t f'(Z(u))\,dB(u) - (k+\ell)Y^c(t) - \sum_{0 \le u \le t} \big(K(\Delta Z(u)) + k\,\Delta Z(u)\big). \]

Since $\Delta Z(t) = \Delta Y(t)$ and $Y(t) = \sum_{0 \le u \le t} \Delta Y(u) + Y^c(t)$, the above inequality can be written as

\[ f(Z(t)) + \int_0^t h(Z(u))\,du + \sum_{0 \le u \le t} K(\Delta Y(u)) + kY(t) + \ell Y^c(t) \ge f(x) + \nu t + \sigma\int_0^t f'(Z(u))\,dB(u). \tag{4.12} \]

By (4.9) and Theorem 3.2.1 in Øksendal (2003),

\[ \mathbb{E}_x\Big[\int_0^t f'(Z(u))\,dB(u)\Big] = 0. \]

Since (4.8) holds, we can take expectation on both sides of (4.12), which yields

\[ \mathbb{E}_x[f(Z(t))] + \mathbb{E}_x\Big[\int_0^t h(Z(u))\,du + \sum_{0 \le u \le t} K(\Delta Y(u)) + kY(t) + \ell Y^c(t)\Big] \ge f(x) + \nu t. \]

Dividing both sides by $t$ and letting $t$ go to infinity, we have

\[ \liminf_{t\to\infty} \frac{1}{t}\,\mathbb{E}_x[f(Z(t))] + \limsup_{t\to\infty} \frac{1}{t}\,\mathbb{E}_x\Big[\int_0^t h(Z(u))\,du + \sum_{0 \le u \le t} K(\Delta Y(u)) + kY(t) + \ell Y^c(t)\Big] \ge \nu. \]

Then, it follows from (2.8) that

\[ \liminf_{t\to\infty} \frac{1}{t}\,\mathbb{E}_x[f(Z(t))] + AC(x,Y) \ge \nu. \tag{4.13} \]

By (4.13), $AC(x,Y) \ge \nu$ holds when

\[ \liminf_{t\to\infty} \frac{1}{t}\,\mathbb{E}_x[f(Z(t))] \le 0. \tag{4.14} \]

Otherwise, there exists $c > 0$ such that

\[ \liminf_{t\to\infty} \frac{1}{t}\,\mathbb{E}_x[f(Z(t))] > c. \tag{4.15} \]

We next show that $AC(x,Y) = \infty$ if inequality (4.15) holds. Hence, $AC(x,Y) \ge \nu$ must be true.

It follows from (4.10) and (4.15) that

\[ \liminf_{t\to\infty} \frac{1}{t}\,\mathbb{E}_x\big[f(Z(t)) \cdot 1_{(-\infty,0)}(Z(t))\big] > c, \]

and thus

\[ \mathbb{E}_x\big[f(Z(t)) \cdot 1_{(-\infty,0)}(Z(t))\big] > \frac{ct}{2} \]

for $t$ sufficiently large. By (4.2), there is some $c_0 > 0$ such that $|f(z)| \le a_0|z| + c_0$ for $z < 0$. Then,

\[ \mathbb{E}_x[|Z(t)|] > \frac{ct - 2c_0}{2a_0} \tag{4.16} \]

for $t$ sufficiently large. Since $h$ is convex with $h'(z) > 0$ for $z > 0$ and $h'(z) < 0$ for $z < 0$, we can
find $c_1, c_2 > 0$ such that $h(z) \ge c_1|z| - c_2$ for all $z \in \mathbb{R}$. Therefore,

\[ \limsup_{t\to\infty} \frac{1}{t}\,\mathbb{E}_x\Big[\int_0^t h(Z(u))\,du\Big] \ge \limsup_{t\to\infty} \frac{c_1}{t}\,\mathbb{E}_x\Big[\int_0^t |Z(u)|\,du\Big] - c_2, \]

where, by (4.16), the right side must be infinite. Hence, we must have $AC(x,Y) = \infty$.  □


Remark 4.1. The boundedness conditions (4.9) and (4.10) are essential to prove Proposition 2 by
Itô's formula. More specifically, condition (4.9) ensures that (4.13) holds, condition (4.10) ensures
that (4.14) holds as long as the long-run average cost is finite, and the lower bound result follows
from these two inequalities. Since conditions (4.9) and (4.10) do not hold for all admissible policies,
Theorem 2 is the critical tool for establishing a lower bound for all of them. In the Brownian model
studied by Harrison et al. (1983), Ormeci et al. (2008), and Dai and Yao (2013a,b), inventory
is allowed to be adjusted both upwards and downwards. The optimal policy in that setting is a
control band policy under which the inventory level is confined within a finite interval. Because the
relative value function under a control band policy is Lipschitz continuous, these authors imposed
a Lipschitz assumption on $f$ in their verification theorems. This assumption ensures that condition
(4.9) holds for all admissible policies (which yields (4.13)) and that condition (4.14) holds when
the long-run average cost is finite, so one can obtain a lower bound for all admissible policies
immediately. In our Brownian model, however, only upward adjustments are allowed and the
optimal policy is an $(s, S)$ policy whose relative value function is not Lipschitz continuous (see
Remark 6.1). Without the Lipschitz assumption, conditions (4.9) and (4.14) may no longer
hold for a general admissible policy, even if we assume the associated long-run average cost to be
finite. In this case, Wu and Chao (2014) and Yao et al. (2015) restricted their scope to the subset
of policies that satisfy (4.9) and (4.14). Their lower bounds are established within this subset, and
consequently, their proposed policies are proved optimal within the same subset. Theorem 2 in the
present paper enables us to establish a lower bound for all admissible policies. We can thus prove
the proposed policy to be globally optimal.


5 Relative value function and long-run average cost


In order to prove the proposed policy to be optimal, let us first analyze the long-run average cost 
under an arbitrary (s, S) policy. An important notion for the analysis is the relative value function 
under the (s, S) policy. The relative value function and the associated long-run average cost jointly 
satisfy an ordinary differential equation with some boundary conditions. 


Proposition 3. Assume that $h$ satisfies (H1)-(H5). For any pair $(s,S)$ with $s \le S$, there exists a
positive number $\nu$ and a twice continuously differentiable function $V : \mathbb{R} \to \mathbb{R}$ that jointly satisfy

\[ \Gamma V(z) + h(z) = \nu \quad \text{for } z \in \mathbb{R} \tag{5.1} \]

with boundary conditions

\[
\begin{cases}
V(S) - V(s) = -K(S-s) - k(S-s) & \text{if } s < S,\\
V'(s) = -(k+\ell) & \text{if } s = S,
\end{cases}
\tag{5.2}
\]

and

\[ \lim_{z\to\infty} e^{-\alpha z}\,V'(z) = 0 \quad \text{for } \alpha > 0. \tag{5.3} \]

Specifically, the solution to (5.1)-(5.3) is $\nu = \gamma(s,S)$, where $\gamma(s,S)$ is given by (3.1), and

\[ V(z) = -\frac{\nu}{\mu}(z - s) + \frac{\lambda}{\mu}\int_s^z\!\int_0^\infty h(u+y)e^{-\lambda u}\,du\,dy, \tag{5.4} \]

where $V$ is unique up to addition by a constant. Assume that $K$ satisfies (S3) if $s = S$. Then,
$AC(x, U(s,S)) = \gamma(s,S)$, i.e., $\gamma(s,S)$ is the long-run average cost under the $(s,S)$ policy.

Remark 5.1. For $z > s$, $V(z)$ can be interpreted as the cost disadvantage of inventory level $z$ relative
to the reorder level $s$. Under the $(s,S)$ policy, $\tau(s)$ defined by (3.4) can be interpreted as the first
ordering time, given that $Z(0-) = z$, and $H_z(s)$ is the expected holding and shortage cost during
$[0, \tau(s)]$. Following the arguments in Remark 3.1, we have

\[ \mathbb{E}_z[\tau(s)] = \frac{z - s}{\mu} \quad\text{and}\quad H_z(s) = \frac{\lambda}{\mu}\int_s^z\!\int_0^\infty h(u+y)e^{-\lambda u}\,du\,dy. \]

By (5.4), $V(z)$ can be decomposed into

\[ V(z) = H_z(s) - \nu\cdot\mathbb{E}_z[\tau(s)]. \]

In this equation, $H_z(s)$ is the cost disadvantage of a system starting from time zero with initial
level $Z(0-) = z$ compared with a system starting from time $\tau(s)$ with initial level $Z(\tau(s)-) = s$,
while $\nu\cdot\mathbb{E}_z[\tau(s)]$ represents the cost disadvantage of a system starting from time zero compared
with the delayed system starting from time $\tau(s)$. As the difference between these two costs, $V(z)$
represents the relative cost disadvantage of inventory level $z$ compared with the reorder level $s$.


Proof of Proposition 3. We obtain the explicit solution $(\nu, V)$ to the boundary value problem (5.1)-(5.3)
as follows. If such a solution exists, write $g(z) = V'(z)$ for $z \in \mathbb{R}$. By (5.1) and (5.3), $g$ satisfies
the following linear first-order ordinary differential equation,

\[ g'(z) - \lambda g(z) = -\frac{\lambda h(z)}{\mu} + \frac{\lambda\nu}{\mu}, \]

with boundary condition

\[ \lim_{z\to\infty} e^{-\alpha z} g(z) = 0 \quad \text{for } \alpha > 0. \]

Since $h$ is polynomially bounded, for each $\nu \in \mathbb{R}$, the above equation has a unique solution

\[ g(z) = -\frac{\nu}{\mu} + \frac{\lambda}{\mu}\int_0^\infty h(y+z)e^{-\lambda y}\,dy, \]

which yields (5.4). By the boundary condition (5.2), we obtain $\nu = \gamma(s,S)$.

It remains to show $AC(x, U(s,S)) = \gamma(s,S)$. By (2.1), (2.3), (5.1), and Itô's formula,

\[ V(Z(t)) = V(Z(0)) + \nu t - \int_0^t h(Z(u))\,du + \sigma\int_0^t g(Z(u))\,dB(u) + \int_0^t g(Z(u))\,dY^c(u) + \sum_{0 < u \le t} \Delta V(Z(u)). \tag{5.5} \]

Under the $(s,S)$ policy with $s < S$, it follows from (5.2) that $\Delta V(Z(u)) = -K(S-s) - k(S-s)$
whenever $\Delta Z(u) > 0$ and $u > 0$. Since $Y^c(t) = 0$ for $t \ge 0$, equation (5.5) turns out to be

\[ V(Z(t)) = V(Z(0)) + \nu t - \int_0^t h(Z(u))\,du + \sigma\int_0^t g(Z(u))\,dB(u) - K(S-s)J(t) - k\big(Y(t) - Y(0)\big), \]

where $J(t)$ is the cardinality of $\{u \in (0,t] : \Delta Y(u) > 0\}$. By (4.9),

\[ \mathbb{E}_x[V(Z(t))] = \mathbb{E}_x[V(Z(0)) + kY(0)] + \nu t - \mathbb{E}_x\Big[\int_0^t h(Z(u))\,du + K(S-s)J(t) + kY(t)\Big]. \tag{5.6} \]

When $s = S$, by Lemma 1, both $Y$ and $Z$ have continuous sample paths. By (5.2) and (5.5),

\[ V(Z(t)) = V(Z(0)) + \nu t - \int_0^t h(Z(u))\,du + \sigma\int_0^t g(Z(u))\,dB(u) - (k+\ell)Y^c(t). \]

Since $Y^c(t) = Y(t) - Y(0)$, taking expectation on both sides, we obtain

\[ \mathbb{E}_x[V(Z(t))] = \mathbb{E}_x[V(Z(0)) + kY(0)] + \nu t - \mathbb{E}_x\Big[\int_0^t h(Z(u))\,du + kY(t) + \ell Y^c(t)\Big]. \tag{5.7} \]

Under the $(s, S)$ policy with $s \le S$, $Y(0) \le |S - x|$ and $x \wedge S \le Z(0) \le x \vee S$. It follows that

\[ \lim_{t\to\infty} \frac{1}{t}\,\mathbb{E}_x[V(Z(0)) + kY(0)] = 0. \]

Because $|V(Z(t))| \le |V(Z(t)) \cdot 1_{[0,\infty)}(Z(t))| + \max\{|V(z)| : (s \wedge 0) \le z \le 0\}$, we obtain

\[ \lim_{t\to\infty} \frac{1}{t}\,\mathbb{E}_x[V(Z(t))] = 0 \]

by (4.10). Then, it follows from (2.8) and (5.6)-(5.7) that $AC(x, U(s,S)) = \gamma(s,S)$.  □


Remark 5.2. When the setup cost is constant for any order quantity, the optimal reorder and
order-up-to levels can be obtained by adding a smooth pasting condition

\[ V'(s^*) = V'(S^*) = -k, \tag{5.8} \]

where, with slight abuse of notation, $V$ should be understood as the relative value function under
$U(s^*,S^*)$; see Bather, Taksar (1985), and Sulem (1986) for the interpretation of the smooth
pasting condition. This condition, together with (5.1)-(5.3), defines a free boundary problem by
which $(s^*,S^*)$ can be uniquely determined. In our Brownian model, however, the general setup cost
function has imposed a quantity constraint on each setup cost value. With these constraints, the
smoothness condition (5.8) may no longer hold at the free boundary. In other words, the smooth
pasting method cannot be used for our problem.


6 Optimal ordering policy 


The optimality result is proved in this section. We first confine ordering policies to the $(s, S)$ type,
proving the existence of the optimal $(s, S)$ policy. Then, we show that the relative value function
associated with the optimal $(s, S)$ policy and the resulting long-run average cost jointly satisfy the
conditions in the verification theorem, thus proving Theorem 1 by the lower bound approach.

We establish a series of lemmas to prove Proposition 1 and Theorem 1. In particular, the
following function $g_0 : \mathbb{R} \to \mathbb{R}_+$, defined by

\[ g_0(z) = \frac{\lambda}{\mu}\int_0^\infty h(y+z)e^{-\lambda y}\,dy, \tag{6.1} \]

is frequently used in the analysis. The first derivative of $g_0$ is

\[ g_0'(z) = \frac{\lambda}{\mu}\Big(\lambda\int_0^\infty h(y+z)e^{-\lambda y}\,dy - h(z)\Big), \tag{6.2} \]

and $g_0$ is a solution to the linear first-order ordinary differential equation

\[ \tfrac{1}{2}\sigma^2 g_0'(z) - \mu g_0(z) + h(z) = 0. \tag{6.3} \]

Using the derivative, we specify the monotone intervals of $g_0$ in the following lemma.


Lemma 4. Assume that $h$ satisfies (H1)-(H5). Then,

\[ \lim_{z\to\pm\infty} g_0(z) = \infty \tag{6.4} \]

and

\[
\begin{cases}
g_0'(z) < 0 & \text{for } z < z^*,\\
g_0'(z) = 0 & \text{for } z = z^*,\\
g_0'(z) > 0 & \text{for } z > z^*,
\end{cases}
\tag{6.5}
\]

where $z^*$ is uniquely determined by (3.3) and is less than zero.

Remark 6.1. The relative value function $V$ given by (5.4) satisfies $V'(z) = -\nu/\mu + g_0(z)$, so by
(6.4), $V'$ is unbounded. This implies that the relative value function for an $(s, S)$ policy is not
Lipschitz continuous.
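As a sanity check on the definitions in (6.1)-(6.3), the differential equation (6.3) can be verified numerically for a concrete cost function. The sketch below assumes a piecewise-linear $h$ with hypothetical holding rate `c_hold` and shortage rate `c_short`; the closed form of $g_0$ coded in `g0` was derived for this illustrative $h$ only and is not taken from the paper. The derivative $g_0'$ is approximated by a central difference.

```python
import math

# Hypothetical problem data (assumptions for illustration only).
mu, sigma = 1.0, 2.0
lam = 2.0 * mu / sigma ** 2
c_hold, c_short = 1.0, 4.0


def h(z):
    """Piecewise-linear holding and shortage cost."""
    return c_hold * z if z >= 0.0 else -c_short * z


def g0(z):
    """Closed form of (6.1) for the piecewise-linear h above."""
    if z >= 0.0:
        return c_hold * (z + 1.0 / lam) / mu
    return (-c_short * z - c_short / lam
            + (c_hold + c_short) * math.exp(lam * z) / lam) / mu


def ode_residual(z, eps=1e-5):
    """Left side of (6.3), with g0' replaced by a central difference."""
    dg0 = (g0(z + eps) - g0(z - eps)) / (2.0 * eps)
    return 0.5 * sigma ** 2 * dg0 - mu * g0(z) + h(z)


# Evaluate away from z = 0, where the second derivative of g0 jumps.
residuals = [ode_residual(z) for z in (-3.0, -1.0, -0.2, 0.5, 2.0)]
```

The residuals should vanish up to discretization error. Since $g_0$ grows linearly in both directions for this $h$, the same example also illustrates (6.4).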

If each order quantity is fixed at $\xi > 0$, the optimal reorder and order-up-to levels will be
determined by minimizing the holding and shortage cost. We use $s(\xi)$ to denote the optimal
reorder level, and the corresponding order-up-to level is $S(\xi) = s(\xi) + \xi$. If a base stock policy is
used, we use $s(0)$ to denote the optimal base stock level, in which case $S(0) = s(0)$. For $\xi > 0$, $s(\xi)$
is specified in Lemmas 5 and 6.

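Lemma 5. Assume that $h$ satisfies (H1)-(H5). Then for each $\xi > 0$, there exists a unique
$(s(\xi), S(\xi)) \in \mathbb{R}^2$ such that

\[ g_0(s(\xi)) = g_0(S(\xi)) \quad\text{and}\quad S(\xi) = s(\xi) + \xi. \tag{6.6} \]

This solution satisfies

\[ s(\xi) < z^* < S(\xi) \quad \text{for } \xi > 0, \tag{6.7} \]

\[ \lim_{\xi\to\infty} s(\xi) = -\infty \quad\text{and}\quad \lim_{\xi\to\infty} S(\xi) = \infty. \tag{6.8} \]

Moreover, both $s$ and $S$ are differentiable on $(0,\infty)$, with

\[ s'(\xi) < 0 \quad\text{and}\quad S'(\xi) > 0 \quad \text{for } \xi > 0. \tag{6.9} \]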

The value of $s(\xi)$ is determined by (6.6) for $\xi > 0$. Taking $s(0) = z^*$, we extend the domain of
$s$ to $[0,\infty)$. For notational convenience, let us write

\[ \tilde{\gamma}(s,\xi) = \gamma(s, s+\xi) \quad \text{for } s \in \mathbb{R} \text{ and } \xi \ge 0, \tag{6.10} \]

which is the long-run average cost with the reorder level fixed at $s$ and the order-up-to level fixed
at $s + \xi$. By (3.1) and (6.1),

\[
\tilde{\gamma}(s,\xi) =
\begin{cases}
k\mu + \dfrac{K(\xi)\mu}{\xi} + \dfrac{\mu}{\xi}\displaystyle\int_s^{s+\xi} g_0(y)\,dy & \text{if } \xi > 0,\\[2ex]
(k+\ell)\mu + \mu g_0(s) & \text{if } \xi = 0.
\end{cases}
\tag{6.11}
\]

For $\xi \ge 0$, let

\[ \theta(\xi) = \inf\{\tilde{\gamma}(s,\xi) : s \in \mathbb{R}\}, \tag{6.12} \]

which is the minimum long-run average cost when the quantity of each order is fixed at $\xi$ (a base
stock policy is used if $\xi = 0$). The next lemma says that this minimum cost can be attained by
setting the reorder level at $s(\xi)$. In addition, $\theta$ is lower semicontinuous.


Lemma 6. Assume that $K$ satisfies (S1), (S3), and (S4), and that $h$ satisfies (H1)-(H5). Then,

\[ \theta(\xi) = \tilde{\gamma}(s(\xi), \xi) \quad \text{for } \xi \ge 0, \tag{6.13} \]

where $s(0) = z^*$ and $s(\xi)$ is determined by (6.6) for $\xi > 0$. Moreover, $\theta$ is lower semicontinuous
on $[0, \infty)$ and satisfies

\[ \lim_{\xi\to\infty} \theta(\xi) = \infty. \tag{6.14} \]

Proof of Proposition 1. By (6.10) and (6.12),

\[ \inf\{\gamma(s,S) : s \le S\} = \inf\{\theta(\xi) : \xi \ge 0\}. \]

To prove (3.2), we need to show that there exists $\xi^* \ge 0$ such that $\theta(\xi^*) = \inf\{\theta(\xi) : \xi \ge 0\}$.
By (6.14), if $\xi^*$ exists, there must be some $M < \infty$ such that $\xi^* \le M$. Because $\theta$ is lower
semicontinuous on $[0,\infty)$, by the extreme value theorem (see, e.g., Theorem B.2 in Puterman
1994), there exists $\hat{\xi} \in [0,M]$ such that $\theta(\hat{\xi}) \le \theta(\xi)$ for all $\xi \in [0,M]$. Hence, $\xi^* = \hat{\xi}$ must be a
minimizer of $\theta$. Taking $s^* = s(\xi^*)$ and $S^* = S(\xi^*)$, we deduce that (3.2) holds.

Lemma 4 provides the properties of $z^*$. If $K(0+) > 0$, we obtain $\theta(0) = \tilde{\gamma}(z^*, 0) = \infty$ because
$\ell = \infty$. This implies that $\xi^* > 0$, and by (6.7), we obtain $s^* < z^* < S^*$. If $K(0+) = 0$, it may
happen that $\xi^* = 0$, in which case $s^* = z^* = S^*$ since $s(0) = z^*$. If $K(0+) = 0$ and $\xi^* > 0$, we have
$s^* < z^* < S^*$ again by (6.7).  □

It remains to prove the global optimality of $U(s^*,S^*)$ using the verification theorem. Under
this policy, the long-run average cost in (3.1) and the relative value function in (5.4) satisfy all
conditions specified in Proposition 2 except for (4.2). The relative value function should thus be
modified to fulfill this condition. To this end, we establish the following lemma.


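Lemma 7. Assume that $K$ satisfies (S1)-(S4) and that $h$ satisfies (H1)-(H5). Then, there exists
$\underline{s} \in (-\infty, z^*)$ such that

\[ \bar{\gamma}(s,\xi) \ge \gamma(s^*,S^*) \quad \text{for } s \in \mathbb{R} \text{ and } \xi > 0, \tag{6.15} \]

where

\[ \bar{\gamma}(s,\xi) = k\mu + \frac{K(\xi)\mu}{\xi} + \frac{\mu}{\xi}\int_s^{s+\xi} g_0(y \vee \underline{s})\,dy. \tag{6.16} \]

The modified relative value function is defined by

\[ V^*(z) = -\frac{\gamma(s^*,S^*)}{\mu}(z - \underline{s}) + \int_{\underline{s}}^{z} g_0(y \vee \underline{s})\,dy \quad \text{for } z \in \mathbb{R}, \tag{6.17} \]

with which we are ready to present the proof of Theorem 1.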

Proof of Theorem 1. Let us show that $(\gamma(s^*,S^*), V^*)$ satisfies all conditions specified in
Proposition 2, so $U(s^*,S^*)$ is an optimal ordering policy. Clearly, $V^*$ is twice differentiable except at $\underline{s}$.
By (6.16)-(6.17),

\[ \bar{\gamma}(s,\xi) = \frac{\mu}{\xi}\big(k\xi + K(\xi) + V^*(s+\xi) - V^*(s)\big) + \gamma(s^*,S^*) \quad \text{for } s \in \mathbb{R} \text{ and } \xi > 0. \]

Then by (6.15),

\[ V^*(s+\xi) - V^*(s) \ge -K(\xi) - k\xi, \]

which implies that $V^*$ satisfies (4.1). By (6.5), $g_0(z^*) \le g_0(z \vee \underline{s}) \le (g_0(\underline{s}) \vee g_0(0))$ for $z < 0$, from
which condition (4.2) follows. Condition (4.3) holds because $h$ is polynomially bounded. For $z > \underline{s}$,
it follows from (6.3) and (6.17) that

\[ \Gamma V^*(z) + h(z) = \tfrac{1}{2}\sigma^2 g_0'(z) - \mu g_0(z) + h(z) + \gamma(s^*,S^*) = \gamma(s^*,S^*). \]

For $z < \underline{s}$, the derivative of $g_0(z \vee \underline{s})$ with respect to $z$ is zero. Since $\underline{s} < z^*$, it follows from (6.5) that
$g_0'(z) < 0$ and $g_0(z) > g_0(\underline{s})$. Then,

\[ \Gamma V^*(z) + h(z) = -\mu g_0(\underline{s}) + h(z) + \gamma(s^*,S^*) > \tfrac{1}{2}\sigma^2 g_0'(z) - \mu g_0(z) + h(z) + \gamma(s^*,S^*) = \gamma(s^*,S^*). \]

Hence, $(\gamma(s^*,S^*), V^*)$ satisfies condition (4.4).  □


7 Policies subject to order-up-to bounds 

This section is devoted to the proof of Theorem 2. Let $Y$ be an admissible policy. We first modify
this policy to construct a policy $Y_m \in \mathcal{U}_m$, where $m$ is a fixed positive integer. Then, we prove that
$\{Y_m \in \mathcal{U}_m : m = 1, 2, \ldots\}$ has a subsequence that satisfies (3.5).

For each $Y$, we would construct a policy $Y_m \in \mathcal{U}_m$ that incurs less holding and shortage cost and
less proportional cost. As $m$ grows large, the average setup cost under $Y_m$ should be asymptotically
dominated by that under $Y$. Although by imposing an order-up-to bound, we can easily construct
a policy that maintains a lower inventory level, we must make additional adjustments to ensure
that the shortage level under $Y_m$ will not be higher. Such a policy is constructed as follows.

Let $Y_m^c$ be the continuous part of $Y_m$. Under $Y_m$, the inventory level at time $t$ is

\[ Z_m(t) = X(t) + Y_m(t), \tag{7.1} \]

where $X(t)$ is given by (2.2) and

\[ Y_m(t) = Y_m^c(t) + \sum_{0 \le u \le t} \Delta Y_m(u). \]

The continuous part of $Y_m$ is constructed by

\[ Y_m^c(t) = \int_0^t 1_{(-\infty,m]}(Z_m(u-))\,dY^c(u), \tag{7.2} \]

where $Y^c$ is the continuous part of $Y$. On each sample path, $Y_m$ may have a jump either at a jump
time of $Y$ or at a hitting time of zero by $Z_m$. Let $J_m = \{t \ge 0 : \Delta Y_m(t) > 0\}$ be the set of jump
times of $Y_m$, $J = \{t \ge 0 : \Delta Y(t) > 0\}$ the set of jump times of $Y$, and $I_m = \{t \ge 0 : Z_m(t-) = 0\}$
the set of hitting times of zero by $Z_m$. Then, $J_m \subset J \cup I_m$. The size of each jump of $Y_m$ is specified
as follows:

(J1) $\Delta Y_m(t) = 0$ for $t \in J$, if $Z_m(t-) > m/2$;

(J2) $\Delta Y_m(t) = \Delta Y(t)$ for $t \in J$, if $Z_m(t-) \le m/2$ and $Z_m(t-) + \Delta Y(t) \le m$;

(J3) $\Delta Y_m(t) = m - Z_m(t-)$ for $t \in J$, if $Z_m(t-) \le m/2$ and $Z_m(t-) + \Delta Y(t) > m$;

(J4) $\Delta Y_m(t) = (Z(t) \wedge m)^{+}$ for $t \in I_m \setminus J$, where $Z$ is the inventory process under policy $Y$.

In other words, $Y_m$ does not make jumps when the inventory level is above $m/2$. If the
inventory level is below $m/2$, $Y_m$ has simultaneous jumps with $Y$. Each simultaneous jump takes the
corresponding jump size of $Y$, as long as the inventory level will not exceed $m$ after that jump;
otherwise, the simultaneous jump will replenish the inventory level to $m$. In addition, $Y_m$ may have
jumps when the inventory level reaches zero. In this case, it will replenish the inventory level to
$Z(t) \wedge m$, if the inventory level of the system under policy $Y$ satisfies $Z(t) > 0$; otherwise, $Y_m$ does
not make a jump.

Figure 1: A sample path of the inventory process under policy $Y$ and the corresponding sample
path under the modified policy $Y_m$, with $Z$ given by (2.1) and $Z_m$ given by (7.1).

The above policy construction procedure is illustrated in Figure 1. We plot a sample path of
the inventory process under policy $Y$ and the corresponding sample path under policy $Y_m$. We use
the blue curve for the inventory process under $Y$, the dashed red curve for that under $Y_m$, and the
black curve for their identical parts. The type of each jump is indicated beside the jump point. In
addition to these jumps, we assume that $Y^c$ increases over time intervals (C1) and (C2). As $Z_m$ is
below $m$ over (C1), $Y_m^c$ and $Y^c$ have the same increments during the time; however, $Y_m^c$ does not
increase over (C2) while $Z_m$ is above $m$.

The following lemma states that compared with policy $Y$, the modified policy $Y_m$ maintains a
lower inventory level and the same shortage level.


Lemma 8. Let $Y$ be an admissible policy. For a fixed positive integer $m$, let $Y_m$ be the policy
constructed according to (7.2) and (J1)-(J4). Then, $Z_m(t) \le Z(t)$ for all $t \ge 0$ on each sample
path, where $Z$ is the inventory process under policy $Y$. In particular, $Z_m(t) = Z(t)$ when $Z_m(t) < 0$.


Lemma 8 implies that the modified policy $Y_m$ incurs less holding and shortage cost and less
proportional cost than policy $Y$. To prove the comparison theorem, we should also establish
asymptotic dominance between the average setup costs incurred by these two policies.


Proof of Theorem 2. Because $h$ is nondecreasing on $[0,\infty)$, it follows from Lemma 8 that

\[ \int_0^t h(Z_m(u))\,du \le \int_0^t h(Z(u))\,du \quad \text{for } t \ge 0, \]

i.e., $Y_m$ incurs less holding and shortage cost than $Y$. Since $Z_m(t) \le Z(t)$, we have $Y_m(t) \le Y(t)$, so
$Y_m$ incurs less proportional cost, too. By (7.2), $Y_m^c(t) \le Y^c(t)$ for all $t \ge 0$. Then, if $K(0+) = 0$, we
obtain $\ell Y_m^c(t) \le \ell Y^c(t)$, i.e., $Y_m^c$ incurs less setup cost than $Y^c$. If $K(0+) > 0$ and there exists some
$t > 0$ for which $Y^c(t) > 0$, the cumulative setup cost incurred by $Y^c$ must be infinite. Therefore,
we only need to consider the setup costs incurred by jumps.

When a jump of type (J2) is made by $Y_m$, the setup cost is equal to the cost incurred by the
simultaneous jump of $Y$.

Consider two consecutive jumps of type (J3). Let $t_1$ and $t_2$ be their respective jump times with
$0 \le t_1 < t_2$. Because $X$ has continuous sample paths and $Y_m$ is nondecreasing, it follows from (7.1)
that $X(t_1) - X(t_2) \ge Z_m(t_1) - Z_m(t_2-) \ge m/2$. Let

\[ \tau_1 = \inf\Big\{u \in (0, t_2 - t_1] : X(t_1 + u) = X(t_1) - \frac{m}{2}\Big\}. \]

By the strong Markov property of Brownian motion, $\tau_1$ has the same distribution as

\[ \tau = \inf\Big\{u \ge 0 : -\mu u + \sigma B(u) = -\frac{m}{2}\Big\}, \]

where $B$ is a standard Brownian motion starting with $B(0) = 0$. Because $\tau$ is the first hitting time
of $-m/2$ by a Brownian motion with drift $-\mu$, we obtain $\mathbb{E}_x[\tau_1] = m/(2\mu)$. Let $N_{m,1}(t)$ be the
number of jumps of type (J3) made by $Z_m$ up to time $t$. Because $t_2 - t_1 \ge \tau_1$, it follows that

\[ \mathbb{E}_x[N_{m,1}(t)] \le \frac{2\mu t}{m} + 1. \]

Now consider two consecutive positive jumps of type (J4). Let $t_1$ and $t_2$ be their respective
jump times with $0 \le t_1 < t_2$. We would like to show that there exists some $t_0 \in [t_1, t_2)$ for which
$Z_m(t_0) > m/2$. Since $\Delta Y_m(t_2) > 0$, $Z_m(t_2-) \ne Z(t_2-)$. If $Z_m(t_1) = Z(t_1)$, $t_0$ must exist because
otherwise, $Y_m$ can only have jumps of type (J2) during $(t_1, t_2)$ and this yields $Z_m(t_2-) = Z(t_2-)$,
a contradiction. If $Z_m(t_1) \ne Z(t_1)$, we have $Z_m(t_1) = m$ and thus set $t_0 = t_1$. Therefore,
$Z_m(t_0) > m/2$ holds for some $t_0 \in [t_1, t_2)$. Let

\[ \tau_2 = \inf\Big\{u \in (0, t_2 - t_0] : X(t_0 + u) = X(t_0) - \frac{m}{2}\Big\}, \]

which also satisfies $\mathbb{E}_x[\tau_2] = m/(2\mu)$. Let $N_{m,2}(t)$ be the number of positive jumps of type (J4)
made by $Z_m$ up to time $t$. Because $\tau_2 \le t_2 - t_1$, we have

\[ \mathbb{E}_x[N_{m,2}(t)] \le \frac{2\mu t}{m} + 1. \]

Put $\bar{K} = \sup\{K(\xi) : \xi > 0\}$, which is finite by (S2). By the discussion above,

\[ AC(x, Y_m) - AC(x, Y) \le \limsup_{t\to\infty} \frac{1}{t}\,\mathbb{E}_x[N_{m,1}(t) + N_{m,2}(t)]\,\bar{K} \le \frac{4\mu\bar{K}}{m}, \]

from which we deduce that

\[ \limsup_{m\to\infty} AC(x, Y_m) \le AC(x, Y). \]

By the Bolzano-Weierstrass theorem, $\{AC(x, Y_m) : m = 1, 2, \ldots\}$ has a convergent subsequence, so
inequality (3.5) holds.  □


8 Optimal ordering policy with a step setup cost function


By Proposition 1 and Theorem 1, we need to solve the following nonlinear optimization problem
to obtain the optimal ordering policy,

\[ \begin{aligned} \min\ & \gamma(s,S)\\ \text{s.t.}\ & s \le S, \end{aligned} \tag{8.1} \]

where $\gamma(s,S)$, given by (3.1), is the long-run average cost under the $(s,S)$ policy. With a setup
cost function that satisfies (S1)-(S4), one may solve (8.1) numerically by a standard grid search or
a random search (see, e.g., Chapter 4 in Hendrix and G.-Tóth 2010). When the setup cost function
takes certain forms, it is possible to obtain the optimal solution in a more efficient way. In this
section, we consider the optimal ordering policy when the setup cost is a step function satisfying
(S1)-(S4), i.e.,

\[ K(\xi) = \sum_{n=1}^{N} K_n \cdot 1_{(Q_{n-1},Q_n)}(\xi) + \sum_{n=1}^{N-1} (K_n \wedge K_{n+1}) \cdot 1_{\{Q_n\}}(\xi) \quad \text{for } \xi > 0, \tag{8.2} \]

where $N$ is a positive integer, $0 = Q_0 < Q_1 < \cdots < Q_{N-1} < Q_N = \infty$, and $K_1, \ldots, K_N$ are
nonnegative real numbers with $K_n \ne K_{n+1}$ for $n = 1, \ldots, N-1$. The setup cost is $K_n$ for any order
quantity within the open interval $(Q_{n-1}, Q_n)$. When the order quantity is $Q_n$ for $n = 1, \ldots, N-1$,
we assume that the buyer is required to pay the lower fee of $K_n$ and $K_{n+1}$. This step function
encompasses most setup cost structures in the literature and in practice, e.g., those in (1.2)-(1.3).
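The step function (8.2) and a brute-force grid search for (8.1) can be sketched as follows. This is only an illustration of the generic grid search mentioned above, not a more efficient method. The breakpoints `Q`, the fees `fees`, the piecewise-linear cost $h$ (rates `c_hold`, `c_short`), and the grid bounds are all hypothetical; the antiderivative in `G_antiderivative` was derived for this particular $h$ so that the inner integral in (3.1) needs no quadrature.

```python
import math

# Hypothetical problem data (assumptions for illustration only).
mu, sigma, k = 1.0, 2.0, 1.0
lam = 2.0 * mu / sigma ** 2
c_hold, c_short = 1.0, 4.0
z_star = -math.log(1.0 + c_hold / c_short) / lam   # solves (3.3) for this h
Q = [0.0, 5.0, math.inf]                           # breakpoints Q_0 < Q_1 < Q_2
fees = [10.0, 15.0]                                # fees K_1, K_2


def setup_cost(xi):
    """Step function (8.2): K_n on (Q_{n-1}, Q_n), the lower adjacent fee at a breakpoint."""
    for n in range(1, len(Q)):
        if Q[n - 1] < xi < Q[n]:
            return fees[n - 1]
        if xi == Q[n] and n < len(fees):
            return min(fees[n - 1], fees[n])
    raise ValueError("order quantity must be positive and finite")


def G_antiderivative(z):
    """Antiderivative of mu * g0 for the piecewise-linear h of this example."""
    tail = (c_hold + c_short) / lam ** 2
    if z >= 0.0:
        return c_hold * (0.5 * z * z + z / lam) + tail
    return -c_short * (0.5 * z * z + z / lam) + tail * math.exp(lam * z)


def average_cost(s, xi):
    """gamma(s, s + xi) from (3.1) for xi > 0."""
    holding = (G_antiderivative(s + xi) - G_antiderivative(s)) / xi
    return k * mu + setup_cost(xi) * mu / xi + holding


def grid_search(points_per_unit=20, s_min=-10, xi_max=20):
    """Exhaustive search over s in [s_min, 0] and xi in (0, xi_max]."""
    best = (math.inf, None, None)
    for i in range(-s_min * points_per_unit + 1):
        s = s_min + i / points_per_unit
        for j in range(1, xi_max * points_per_unit + 1):
            xi = j / points_per_unit
            cost = average_cost(s, xi)
            if cost < best[0]:
                best = (cost, s, s + xi)
    return best


nu_grid, s_grid, S_grid = grid_search()
```

The search visits every grid point, so its cost grows with the square of the resolution. Because $K$ jumps at the breakpoints, a minimizer may sit exactly at a breakpoint, which is why the grid is built from integer multiples of `1 / points_per_unit` so that such points are represented exactly.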


When the step setup cost function in (8.2) has $K_1 = 0$, by placing small orders, the inventory
system can be exempt from setup fees without incurring additional holding and shortage cost. In
this case, we may assume the setup cost to be a zero function. As we discussed in Remark 3.2, the
optimal policy will be a base stock policy whose base stock level is fixed at $z^*$. When the setup cost
function in (8.2) has $K_1 > 0$, by Theorem 1 and Proposition 1, the optimal reorder and order-up-to
levels must satisfy $s^* < S^*$. We may follow a five-step procedure to obtain the optimal parameters.


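Step 1: Obtain $z^*$ by solving the integral equation (3.3). If $K_1 = 0$ in (8.2), taking $s^* = S^* = z^*$,
we obtain an optimal ordering policy, which is a base stock policy with base stock level $z^*$.
The minimum long-run average cost is $\nu^* = \gamma(z^*, z^*) = k\mu + h(z^*)$. Proceed to steps 2-5 if
and only if $K_1 > 0$.

Step 2: Let

\[
\Lambda(y) = \int_{-\infty}^{\infty} 1_{(-\infty, y]}(g_0(u))\, du, \tag{8.3}
\]

where $g_0$ is given by (6.1). For $n = 1, \ldots, N$, obtain $\hat\nu_n$ by solving the integral equation

\[
\int_{h(z^*)/\mu}^{-k + \hat\nu_n/\mu} \Lambda(u)\, du = K_n.
\]

Then, obtain $(\hat s_n, \hat S_n)$ by solving

\[
g_0(\hat s_n) = g_0(\hat S_n) = -k + \frac{\hat\nu_n}{\mu}.
\]

Put $\hat\xi_n = \hat S_n - \hat s_n$.

Step 3: Define three index sets

\[
\begin{aligned}
\mathcal{N}_< &= \{n = 1, \ldots, N : \hat\xi_n \le Q_{n-1}\}, \\
\mathcal{N}_= &= \{n = 1, \ldots, N : Q_{n-1} < \hat\xi_n < Q_n\}, \\
\mathcal{N}_> &= \{n = 1, \ldots, N : \hat\xi_n \ge Q_n\}.
\end{aligned}
\]

Put

\[
\xi_n^* = \begin{cases} Q_{n-1} & \text{if } n \in \mathcal{N}_<, \\ \hat\xi_n & \text{if } n \in \mathcal{N}_=, \\ Q_n & \text{if } n \in \mathcal{N}_>. \end{cases} \tag{8.4}
\]

Then, define the candidate index set by

\[
\mathcal{N} = \{n = 1, \ldots, N : K(\xi_n^*) = K_n\}. \tag{8.5}
\]

Step 4: For $n \in \mathcal{N}_=$, let

\[
s_n = \hat s_n, \quad S_n = \hat S_n, \quad \nu_n = \hat\nu_n. \tag{8.6}
\]

For $n \in \mathcal{N} \setminus \mathcal{N}_=$, obtain $(s_n, S_n)$ by solving the system of equations

\[
\begin{cases} S_n - s_n = \xi_n^*, \\ g_0(s_n) = g_0(S_n), \end{cases} \tag{8.7}
\]

and put

\[
\nu_n = k\mu + \frac{K_n \mu}{\xi_n^*} + \frac{\mu}{\xi_n^*} \int_{s_n}^{S_n} g_0(u)\, du. \tag{8.8}
\]

Step 5: Let

\[
n^* = \min\{n \in \mathcal{N} : \nu_n \le \nu_i \text{ for all } i \in \mathcal{N}\}. \tag{8.9}
\]

Taking

\[
s^* = s_{n^*} \quad \text{and} \quad S^* = S_{n^*}, \tag{8.10}
\]

we obtain the optimal ordering policy, which is an $(s, S)$ policy with reorder level $s^*$ and
order-up-to level $S^*$. The minimum long-run average cost is $\nu^* = \gamma(s^*, S^*) = \nu_{n^*}$, where $\nu_{n^*}$
is given by (8.8)-(8.9).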
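
The bookkeeping in steps 3 and 5 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it assumes that the unconstrained order quantities from step 2 are already available in a list `xi_hat` and that the average costs from step 4 are supplied in a dictionary `nu`. All function names and numerical values are hypothetical.

```python
import math

def setup_cost(xi, Q, K):
    """Step setup cost (8.2): K_n on (Q_{n-1}, Q_n), the lower fee at a breakpoint."""
    for n in range(1, len(K) + 1):
        if Q[n - 1] < xi < Q[n]:
            return K[n - 1]
        if n < len(K) and xi == Q[n]:
            return min(K[n - 1], K[n])
    raise ValueError("xi must be positive and finite")

def adjust_quantities(Q, K, xi_hat):
    """Step 3: move each xi_hat[n-1] to the closest point of [Q_{n-1}, Q_n]
    and collect the candidate indices n with K(xi_star_n) = K_n."""
    xi_star, candidates = [], []
    for n in range(1, len(K) + 1):
        x = xi_hat[n - 1]
        if x <= Q[n - 1]:      # n belongs to N_<
            x_star = Q[n - 1]
        elif x >= Q[n]:        # n belongs to N_>
            x_star = Q[n]
        else:                  # n belongs to N_=
            x_star = x
        xi_star.append(x_star)
        if setup_cost(x_star, Q, K) == K[n - 1]:
            candidates.append(n)
    return xi_star, candidates

def best_index(nu, candidates):
    """Step 5: the smallest candidate index attaining the minimum average cost."""
    best = min(nu[n] for n in candidates)
    return min(n for n in candidates if nu[n] == best)

# Hypothetical inputs: breakpoints, fees, step-2 quantities, and step-4 costs.
Q = [0.0, 10.0, 20.0, math.inf]
K = [2.0, 4.0, 7.0]
xi_hat = [12.0, 13.0, 25.0]
xi_star, candidates = adjust_quantities(Q, K, xi_hat)
n_star = best_index({1: 6.0, 2: 5.5, 3: 6.4}, candidates)
```

In this instance the first quantity is moved down to the breakpoint $Q_1$, where the lower fee $K_1$ still applies, so index 1 stays in the candidate set.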


The following corollary of Theorem states the optimality of the obtained ordering policy. 


Corollary 1. Assume that the setup cost $K$ is given by (8.2) and that the holding and shortage
cost $h$ satisfies (H1)-(H5). If $K_1 = 0$, the base stock policy $U(z^*, z^*)$ is an optimal ordering policy
that minimizes the long-run average cost, i.e., $\nu^* = \gamma(z^*, z^*) = k\mu + h(z^*)$, where $\gamma$ is given by
(3.1). If $K_1 > 0$, with $(s^*, S^*)$ uniquely determined by the steps of the above algorithm, $U(s^*, S^*)$
is an optimal ordering policy that minimizes the long-run average cost, i.e., $\nu^* = \gamma(s^*, S^*)$.

Let us illustrate how the five-step algorithm yields the optimal policy. We obtain $z^*$, the
minimizer of $g_0$, at step 1. By Proposition and Theorem the optimal policy is a base stock
policy if $K_1 = 0$, with $z^*$ being the optimal base stock level. If $K_1 > 0$, the optimal policy is of the
$(s, S)$ type with $s < S$, and we obtain the optimal reorder and order-up-to levels by steps 2-5.
Assuming the setup cost is a constant $K_n$ for any order quantity, we find the optimal reorder
and order-up-to levels $(\hat s_n, \hat S_n)$ for $n = 1, \ldots, N$ in step 2. Under this policy, the quantity of each
order is $\hat\xi_n$ and the long-run average cost is $\hat\nu_n$. The uniqueness and optimality of the obtained
policy can be deduced from the following lemma.

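Lemma 9. Let $\kappa$ be a nonnegative number. Assume that $K(\xi) = \kappa$ for all $\xi > 0$ and that $h$ satisfies
(H1)-(H5). Then, there exists a unique $\hat\xi \ge 0$ such that

\[
\theta(\hat\xi) = \inf\{\gamma(s, S) : s \le S\}, \tag{8.11}
\]

where $\gamma$ is given by (3.1) and $\theta$ is given by (6.13). In particular, $\hat\xi = 0$ if and only if $\kappa = 0$. Write

\[
\hat\nu = \theta(\hat\xi), \quad \hat s = \bar s(\hat\xi), \quad \hat S = \bar S(\hat\xi), \tag{8.12}
\]

where $\bar s$ and $\bar S$ are defined by (6.6). Then, $\hat\nu$ is the unique solution to

\[
\int_{h(z^*)/\mu}^{-k + \hat\nu/\mu} \Lambda(u)\, du = \kappa \quad \text{and} \quad \hat\nu \ge k\mu + h(z^*), \tag{8.13}
\]

where $\Lambda$ is given by (8.3) and $(\hat s, \hat S)$ is the unique solution to

\[
g_0(\hat s) = g_0(\hat S) = -k + \frac{\hat\nu}{\mu}. \tag{8.14}
\]

Moreover, $\hat\xi$, $\hat S$, and $\hat\nu$ are strictly increasing in $\kappa$, whereas $\hat s$ is strictly decreasing in $\kappa$.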
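
As a numerical illustration of Lemma 9, the sketch below solves (8.13)-(8.14) for a constant setup cost. It rests on several assumptions that are not taken from the text: the holding and shortage cost is piecewise linear, $h(z) = bz$ for $z \ge 0$ and $h(z) = -pz$ for $z < 0$ (which may not satisfy every one of (H1)-(H5)); $g_0$ is assumed to take the form $g_0(z) = (\lambda/\mu) \int_0^\infty h(y+z) e^{-\lambda y}\, dy$ with $\lambda = 2\mu/\sigma^2$, which is consistent with the identity $g_0(z^*) = h(z^*)/\mu$ used in this section; and all parameter values are hypothetical. The code uses the fact, shown in the proof of Lemma 9, that (8.13) is equivalent to requiring the area between the level $-k + \hat\nu/\mu$ and $g_0$ over $[\hat s, \hat S]$ to equal $\kappa$.

```python
import math

def make_g0(mu, sigma, b, p):
    """Closed-form g0 and its minimizer z_star for the assumed piecewise-linear h."""
    lam = 2.0 * mu / sigma ** 2
    def g0(z):
        if z >= 0:
            return b * (z + 1.0 / lam) / mu
        a = -z
        return (p * a - p / lam + (p + b) * math.exp(-lam * a) / lam) / mu
    z_star = -math.log((p + b) / p) / lam
    return g0, z_star

def bisect_root(f, lo, hi, tol=1e-10):
    """Root of f on [lo, hi], assuming f(lo) <= 0 <= f(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) <= 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def level_roots(g0, z_star, c):
    """Return (s, S) with s <= z_star <= S and g0(s) = g0(S) = c, as in (8.14)."""
    step = 1.0
    while g0(z_star - step) < c:
        step *= 2.0
    s = bisect_root(lambda y: c - g0(y), z_star - step, z_star)
    step = 1.0
    while g0(z_star + step) < c:
        step *= 2.0
    S = bisect_root(lambda y: g0(y) - c, z_star, z_star + step)
    return s, S

def area_below_level(g0, s, S, c, m=2000):
    """Midpoint-rule approximation of the integral of c - g0(y) over [s, S]."""
    width = (S - s) / m
    return sum(c - g0(s + (i + 0.5) * width) for i in range(m)) * width

def solve_constant_setup(kappa, k, mu, sigma, b, p):
    """Find (s_hat, S_hat, nu_hat) for a constant setup cost kappa."""
    g0, z_star = make_g0(mu, sigma, b, p)
    def excess(c):
        s, S = level_roots(g0, z_star, c)
        return area_below_level(g0, s, S, c) - kappa
    lo = g0(z_star)
    hi = lo + 1.0
    while excess(hi) < 0:
        hi = lo + 2.0 * (hi - lo)
    c = bisect_root(excess, lo, hi, tol=1e-9)
    s, S = level_roots(g0, z_star, c)
    return s, S, mu * (k + c)   # (8.14): c = -k + nu_hat / mu

# Hypothetical parameters: k = 2, mu = sigma = 1, b = 1, p = 4.
s1, S1, nu1 = solve_constant_setup(5.0, 2.0, 1.0, 1.0, 1.0, 4.0)
s2, S2, nu2 = solve_constant_setup(10.0, 2.0, 1.0, 1.0, 1.0, 4.0)
```

Doubling the setup cost in this sketch lowers the reorder level and raises both the order-up-to level and the average cost, in line with the monotonicity stated in Lemma 9.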


Remark 8.1. With a constant setup cost, Bather (1966) identified a set of necessary and sufficient
conditions for the optimal $(s, S)$ policy that minimizes the long-run average cost. Those conditions
are equivalent to (8.13)-(8.14) in Lemma 9; see (4.2)-(4.4) and (5.4)-(5.5) in Bather (1966). In
particular, our Brownian control problem is reduced to Bather's problem when $N = 1$ in (8.2).

The next lemma is a technical result for proving Lemma and Corollary It specifies how
$\theta$, the minimum average cost function, changes with the order quantity when the setup cost is
assumed to be constant.

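Lemma 10. Let $\kappa$ be a positive number. Assume that $K(\xi) = \kappa$ for all $\xi > 0$ and that $h$ satisfies
(H1)-(H5). Then, there exists a unique $\hat\xi > 0$ such that $L(\hat\xi) = \kappa$ where

\[
L(\xi) = \int_{\bar s(\xi)}^{\bar S(\xi)} \big(g_0(\bar s(\xi)) - g_0(y)\big)\, dy.
\]

Moreover, the first derivative of $\theta$ satisfies

\[
\theta'(\xi) \begin{cases} < 0 & \text{for } 0 < \xi < \hat\xi, \\ = 0 & \text{for } \xi = \hat\xi, \\ > 0 & \text{for } \xi > \hat\xi. \end{cases} \tag{8.15}
\]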
Let $K_i$ be the smallest one of $K_1, \ldots, K_N$. By Lemma 9, $\hat\nu_i$ is the smallest one of $\hat\nu_1, \ldots, \hat\nu_N$. If
$K(\hat\xi_i) = K_i$ happens to hold, $U(\hat s_i, \hat S_i)$ must be the optimal $(s, S)$ policy. The setup cost function
in (8.2), however, has imposed a constraint on order quantities for each setup cost value. When
the setup cost is $K_n$, the quantity of an order is confined to an interval from $Q_{n-1}$ to $Q_n$ (which,
by (8.2), may be $(Q_{n-1}, Q_n)$, $(Q_{n-1}, Q_n]$, $[Q_{n-1}, Q_n)$, or $[Q_{n-1}, Q_n]$). If $\hat\xi_i < Q_{i-1}$ or $\hat\xi_i > Q_i$, we
have $K(\hat\xi_i) \ne K_i$, and in either case, $U(\hat s_i, \hat S_i)$ may not be optimal.

With a quantity-dependent setup cost, it is necessary to examine whether each $\hat\xi_n$ falls into
the interval $(Q_{n-1}, Q_n)$; if not, we should adjust the order quantity to make it conform with (8.2).
At step 3, based on the relative position of $\hat\xi_n$ to the interval $(Q_{n-1}, Q_n)$, we define $\xi_n^*$ as the
point in $[Q_{n-1}, Q_n]$ that is the closest to $\hat\xi_n$. By Lemma 10, $\xi_n^*$ is the optimal quantity when each
order is confined in $[Q_{n-1}, Q_n]$ with setup cost $K_n$. Consequently, one of $\xi_1^*, \ldots, \xi_N^*$ must be the
optimal order quantity for the setup cost given by (8.2). We may thus seek the optimal policy by
examining the policies that fix order quantities at $\xi_n^*$ for $n = 1, \ldots, N$. In this procedure, rather
than examining all of $\xi_1^*, \ldots, \xi_N^*$, we may just investigate those in the candidate index set $\mathcal{N}$ defined
by (8.5). We will discuss the candidate index set shortly.


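For $n \in \mathcal{N}$, we obtain the optimal $(s, S)$ policy with the quantity of each order fixed at $\xi_n^*$. This
task is carried out at step 4, where the reorder and order-up-to levels are given by $(s_n, S_n)$ and the
resulting long-run average cost is given by $\nu_n$. When the quantity of each order is fixed at $\hat\xi_n$ with
setup cost $K_n$, $U(\hat s_n, \hat S_n)$ must be the optimal policy for $n \in \mathcal{N}_=$, and the long-run average cost is
equal to $\hat\nu_n$. We may thus define $(s_n, S_n, \nu_n)$ for $n \in \mathcal{N}_=$ by (8.6). By Lemmas 5 and 6, we can obtain
$(s_n, S_n, \nu_n)$ by solving (8.7)-(8.8) for $n \in \mathcal{N} \setminus \mathcal{N}_=$. Note that not all of $\xi_1^*, \ldots, \xi_N^*$ are considered at
step 4. The next lemma implies that it suffices to search for the optimal policy within the candidate
index set $\mathcal{N}$. To state this lemma, let us define

\[
\chi(n) = \max\{j \in \mathcal{N} : j < n\} \quad \text{for } n \in \mathcal{N}_< \setminus \mathcal{N} \tag{8.16}
\]

and

\[
\chi(n) = \min\{j \in \mathcal{N} : j > n\} \quad \text{for } n \in \mathcal{N}_> \setminus \mathcal{N}. \tag{8.17}
\]

For $n = 1, \ldots, N$, put

\[
\tilde\nu_n = \theta_n(\xi_n^*), \tag{8.18}
\]

where

\[
\theta_n(\xi) = k\mu + \frac{K_n \mu}{\xi} + \frac{\mu}{\xi} \int_{\bar s(\xi)}^{\bar S(\xi)} g_0(y)\, dy \quad \text{for } \xi > 0. \tag{8.19}
\]

Note that $\tilde\nu_n = \nu_n$ for $n \in \mathcal{N}$.

Lemma 11. Assume that the setup cost function in (8.2) has $K_1 > 0$ and that $h$ satisfies (H1)-(H5).
Then, for each $n \in \mathcal{N}_< \setminus \mathcal{N}$, $\chi(n)$ defined by (8.16) exists and satisfies $\tilde\nu_{\chi(n)} < \tilde\nu_n$; for each
$n \in \mathcal{N}_> \setminus \mathcal{N}$, $\chi(n)$ defined by (8.17) exists and satisfies $\tilde\nu_{\chi(n)} < \tilde\nu_n$.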


It follows from Lemma 11 that $\nu_{n^*} \le \tilde\nu_n$ for all $n = 1, \ldots, N$, where $n^*$ is given by (8.10) at
step 5. Hence, $U(s^*, S^*)$ is the best candidate of the policies obtained by the above steps. Now let us
prove the optimality of the obtained ordering policy.

Proof of Corollary 1. It suffices to show that $(s^*, S^*)$ obtained by the above steps satisfies (3.2). Then,
the corollary follows from Theorem

Without loss of generality, we may assume $K(\xi) = 0$ for $\xi > 0$ if $K_1 = 0$. Then, it follows from
Lemma 9 that $\hat\xi = 0$ and thus $\bar s(0) = \bar S(0) = z^*$ is the optimal base stock level. Hence, $s^* = S^* = z^*$
and by (3.3), $\nu^* = k\mu + h(z^*)$.

Consider the case $K_1 > 0$. If $s = S$, it follows from (3.1) that $\gamma(s, S) = \infty$ because $\ell = \infty$ by
(2.5). If $s < S$, put $\xi = S - s$ and assume that $K(\xi) = K_n$ for some $n = 1, \ldots, N$. By (8.4), (8.11), and
(8.15), we obtain $\theta_n(\xi) \ge \theta_n(\xi_n^*)$, where $\theta_n$ is given by (8.19). If $n \in \mathcal{N}$,

\[
\gamma(s, S) \ge \theta_n(\xi) \ge \theta_n(\xi_n^*) = \nu_n \ge \nu_{n^*},
\]

where the first inequality follows from (6.10) and (6.12), the first equality follows from (6.6) and
(8.7)-(8.8), and the last inequality follows from (8.9). If $n \notin \mathcal{N}$, we obtain

\[
\gamma(s, S) \ge \theta_n(\xi) \ge \theta_n(\xi_n^*) = \tilde\nu_n \ge \nu_{n^*},
\]

where the equality follows from (8.18) and the last inequality follows from Lemma 11. □


9 Conclusion 

The optimality of (s, S) policies for inventory systems with constant setup costs is a fundamental 
result in inventory theory. Assuming a Brownian demand process, we have extended the optimality 
of (s, S) policies to stochastic inventory models with a general setup cost structure. To achieve 
this, we proved a comparison theorem that allows one to investigate the optimal policy within 
a tractable subset of admissible policies. When the setup cost is a step function, we proposed 
a policy selection procedure for obtaining the optimal control parameters. These results have 
improved the widely used lower bound approach for solving Brownian control problems and may 
apply to inventory models with even more general stochastic demand processes, e.g., mean-reverting
diffusions (see Cadenillas et al. 2010) and spectrally positive Lévy processes (see Kyprianou 2006
and Kuznetsov et al. 2012). We look forward to exploring these extensions in future work.


Technical proofs 


Proof of Lemma. By Lemma, $Z^m(t) \ge m$ for $t \ge 0$, so $Z(t) \le Z^m(t)$ whenever $Z(t) \le m$. For
a fixed $t > 0$, if $Z(u) > m$ for all $u \in [0, t]$, we must have $Y(t) = 0$ and thus $X(u) > m$ for all
$u \in [0, t]$. It follows that $Y^m(t) = 0$ and thus $Z(t) = Z^m(t) = X(t)$. If $Z(t) > m$ but there exists
some $u \in [0, t)$ such that $Z(u) \le m$, we put $t_0 = \sup\{u \in [0, t) : Z(u) \le m\}$. We deduce that
$Z(t_0) \le m$ because otherwise, $Z(t_0) > m$ and $Z(t_0-) \le m$, which contradicts the assumption that
$Y \in \mathcal{U}_m$. Hence, $Z(t_0) \le Z^m(t_0)$ and $t_0 < t$. Because $Z(u) > m$ for $u \in (t_0, t]$, $Y(t) - Y(t_0) = 0$.
By (2.1)-(2.2), $Z(t) = Z(t_0) + X(t) - X(t_0)$. Because $Y^m$ has nondecreasing sample paths,

\[
Z^m(t) = Z^m(t_0) + X(t) - X(t_0) + Y^m(t) - Y^m(t_0) \ge Z(t).
\]

Therefore, $Z(t) \le Z^m(t)$ for all $t \ge 0$. □

Proof of Lemma. It suffices to consider $Y \in \mathcal{U}_m$ for a fixed positive integer $m$. Let $Z^m$ be the
inventory process given by (4.5), which is a reflected Brownian motion with lower reflecting barrier
at $m$. Let $\alpha$ be a positive number. By (4.6),


E. 


1*00 

Jo 

< (xV m)°‘ + a 

J X 


cxWm 


< a 


,,a— 1 


“ + (a:^ V m) - 

v°‘ 


/ xVm 




1*00 

dv + a dv 

J xWm 

jdv + a / d^;. 

' IxWm 


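For $t > 0$ and $v > x \vee m$, we obtain

\[
\frac{-v + (x \vee m) - \mu t}{\sigma t^{1/2}} = -\left(\frac{v - (x \vee m)}{\sigma t^{1/2}} + \frac{\mu t^{1/2}}{\sigma}\right) \le -\frac{2}{\sigma}\big(\mu (v - (x \vee m))\big)^{1/2},
\]

using the inequality of arithmetic and geometric means. Therefore,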

1 1/2 


roo . — ixM mSV ^ \ r°° 

E,,[Z™(t)“] < (xVm)“ + a / ?;“-i$( --F 1^- ’-1 — \dv + a du 

Jx\jm (T / JxVm 


for $t \ge 0$. All terms on the right side are finite and none of them depend on $t$, so

\[
\sup_{t \ge 0} E_x[Z^m(t)^\alpha] < \infty \quad \text{for } \alpha > 0.
\]

Because $X(t) \le Z(t) \le Z^m(t)$ for $t \ge 0$,

\[
|Z(t)|^\alpha \le |X(t)|^\alpha + Z^m(t)^\alpha.
\]

Since $X(t)$ follows a Gaussian distribution with mean $x - \mu t$ and variance $\sigma^2 t$,

\[
\sup_{0 \le u \le t} E_x[|X(u)|^\alpha] < \infty \quad \text{for } t \ge 0.
\]


By (4.7), there exist cq > 0 and ci > 0 such that 

\f{z)\ < Co + Ci\z\^^^ for z G M. 


(A.l) 


(A.2) 


(A.3) 


(A.4) 


Then, we deduce that (4.8) holds from (A.1)-(A.4) and that (4.9) holds from (4.7), (A.1)-(A.3), 
and Tonelli’s theorem. Since Z{t) < Z"^{t) and Z^{t) > m, it follows from (A.4) that 

\f{Z{t)) • l[0,oo)(^(/))| < C0+CiZ^{tf^\ 

which, along with (|A.l ), implies that (|4.10|) holds. □ 


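Proof of Lemma. By (H2)-(H4), $\lim_{z \to \pm\infty} h(y + z) = \infty$, from which (6.4) follows. By (6.2),

\[
g_0''(z) = \frac{\lambda^3}{\mu} \int_0^\infty h(y + z) e^{-\lambda y}\, dy - \frac{\lambda^2}{\mu} h(z) - \frac{\lambda}{\mu} h'(z) \quad \text{for } z \ne 0.
\]

We would show that

\[
g_0'(z) > 0 \quad \text{for } z \ge 0, \tag{A.5}
\]

\[
g_0''(z) > 0 \quad \text{for } z < 0, \tag{A.6}
\]

and

\[
\limsup_{z \to -\infty} g_0'(z) < 0. \tag{A.7}
\]

By these conditions and the continuity of $g_0'$, we obtain (6.5) with $z^* < 0$. Moreover, $z^*$ is unique.

Write (6.2) into

\[
g_0'(z) = \frac{\lambda^2}{\mu} \int_0^\infty \big(h(y + z) - h(z)\big) e^{-\lambda y}\, dy. \tag{A.8}
\]

Then, condition (A.5) follows from (H4). By (H1) and integration by parts,

\[
\int_0^\infty h(y + z) e^{-\lambda y}\, dy = \frac{h(z)}{\lambda} + \frac{1}{\lambda} \int_0^\infty h'(y + z) e^{-\lambda y}\, dy \quad \text{for } z < 0.
\]

It follows that

\[
g_0''(z) = \frac{\lambda^3}{\mu} \int_0^\infty h(y + z) e^{-\lambda y}\, dy - \frac{\lambda^2}{\mu} h(z) - \frac{\lambda}{\mu} h'(z) = \frac{\lambda^2}{\mu} \int_0^\infty h'(y + z) e^{-\lambda y}\, dy - \frac{\lambda}{\mu} h'(z).
\]

Since $h$ is convex, $h'(y) \ge h'(z)$ for $y \ge z$. By (H4),

\[
g_0''(z) > \frac{\lambda^2}{\mu} \int_0^{-z} h'(z) e^{-\lambda y}\, dy - \frac{\lambda}{\mu} h'(z) = -\frac{\lambda}{\mu} h'(z) e^{\lambda z} > 0,
\]

so (A.6) holds. By (H2) and (H4), there exist $z_0 < 0$ and $c_0 > 0$ such that $h'(z) < -c_0$ for all
$z < z_0$. Because $h$ is polynomially bounded,

\[
\lim_{z \to -\infty} \int_{z_0 - z}^\infty \big(h(y + z) - h(z)\big) e^{-\lambda y}\, dy = 0.
\]

Then by (A.8),

\[
\limsup_{z \to -\infty} g_0'(z) = \limsup_{z \to -\infty} \frac{\lambda^2}{\mu} \int_0^{z_0 - z} \big(h(y + z) - h(z)\big) e^{-\lambda y}\, dy \le -\lim_{z \to -\infty} \frac{\lambda^2 c_0}{\mu} \int_0^{z_0 - z} y e^{-\lambda y}\, dy = -\frac{c_0}{\mu},
\]

which leads to (A.7). □

Proof of Lemma. For $\xi > 0$ and $s \in \mathbb{R}$, put

\[
G(s, \xi) = g_0(s + \xi) - g_0(s) = \int_s^{s + \xi} g_0'(y)\, dy.
\]

By (6.1), (6.2), and (H3), $G$ is continuously differentiable on $\mathbb{R} \times (0, \infty)$. If $G(s, \xi) = 0$, we must
have $s < z^* < s + \xi$ by (6.5). Let $\xi > 0$ be fixed. Then, $G(s, \xi)$ is continuous and strictly increasing
in $s$ on $[z^* - \xi, z^*]$, with $G(z^* - \xi, \xi) < 0$ and $G(z^*, \xi) > 0$. Hence, there exists a unique $s = \bar s(\xi)$
such that $G(s, \xi) = 0$, by which we deduce that both (6.6) and (6.7) hold. The limits in (6.8)
follow from (6.4) and (6.6). Using (6.5) again, we obtain

\[
\frac{\partial}{\partial s} G(\bar s(\xi), \xi) = g_0'(\bar S(\xi)) - g_0'(\bar s(\xi)) > 0 \quad \text{for } \xi > 0.
\]

By the implicit function theorem (see, e.g., Theorem 11.1 in Protter 1998), $\bar s$ must be a differentiable
function on $(0, \infty)$. By (6.5)-(6.7),

\[
\frac{\partial}{\partial \xi} G(\bar s(\xi), \xi) = g_0'(\bar s(\xi) + \xi) = g_0'(\bar S(\xi)) > 0.
\]

The implicit function theorem also implies that

\[
\bar s'(\xi) = -\frac{\partial G(\bar s(\xi), \xi)/\partial \xi}{\partial G(\bar s(\xi), \xi)/\partial s} < 0.
\]

It follows from (6.5)-(6.7) that

\[
\bar S'(\xi) = \bar s'(\xi) + 1 = -\frac{g_0'(\bar s(\xi))}{g_0'(\bar S(\xi)) - g_0'(\bar s(\xi))} > 0.
\]

□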
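
The monotonicity of $G(s, \xi)$ in $s$ on $[z^* - \xi, z^*]$ suggests a simple bisection for the reorder level that solves $g_0(s + \xi) = g_0(s)$, which is also the system solved in step 4 of the algorithm in Section 8. The sketch below is illustrative only: the function name `reorder_level` is hypothetical, and `toy_g0` is a stand-in quasi-convex function with minimizer $0$, not the $g_0$ of this paper.

```python
def reorder_level(g0, z_star, xi, tol=1e-10):
    """Solve g0(s + xi) = g0(s) for s in [z_star - xi, z_star] by bisection.

    Relies on G(s, xi) = g0(s + xi) - g0(s) being increasing in s on this
    interval, negative at the left end and positive at the right end.
    """
    lo, hi = z_star - xi, z_star
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g0(mid + xi) - g0(mid) < 0:   # G(mid, xi) < 0: the root lies to the right
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def toy_g0(z):
    """Stand-in quasi-convex function (an assumption for illustration)."""
    return z * z if z < 0 else 4.0 * z * z

xi = 3.0
s = reorder_level(toy_g0, 0.0, xi)   # reorder level for a fixed order quantity
S = s + xi                           # matching order-up-to level
```

For this stand-in function the root can be checked by hand: $s^2 = 4(s + 3)^2$ with $-3 \le s \le 0$ gives $s = -2$ and $S = 1$.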


Proof of Lemma\^ By ( |6.5[ ) and (6.11), 0(0) =7(z*,0) = 7(5(0), 0), so (6.13) holds for ^ = 0. For 
^ > 0, Lemmaimplies that 5 = s(^) is the unique solution to d^{s,f,)/ds = 0. By (6.5)-(6.7), 


d' 


^7(5(a,a = ^s/oism - >», 




so (6.13) also holds for ^ > 0. 


By (2.5) and (6.11), 


liminf 9{f,) = {k + i)fi + liminf 




.5(0 


ao 




goiy) dy. 


Since (6.7) implies that 5(0+) = z*, we obtain liminf^^o^(0 = ^(0); so 9 is lower semicontinuous 
at zero. For ^ > 0, since K is lower semicontinuous, by Proposition B.l in 


Puterman 


(1994), 


'+>< liminf 

{ e^« f 

By Lemma s is continuous on [0,oo), by which we obtain 


rS{0 rS(i) 

/ 5o(y)dy = lim/ go{y)dy 

Jsii) 4^(0 


siO 


It follows that 9{i) < liminf^_j_|0(1), and thus 0 is lower semicontinuous on [0,oo). 
By (6.8) and L’Hopital’s rule, 

lim - / goiy) dy = lim (5'(0 + l)5o(5'(0) - dm 5'(05'o(s(0)- 
5^00 4 5-)-oo $-)-oo 


Then using (6.4), (6.6), and (6.8), we obtain 

1 fSiO 


lim ^ / goiy) dy = lim goiSif)) = 00. 
C^oo 4 4->-oo 
Because the setup cost is nonnegative, the above limit implies that $\lim_{\xi \to \infty} \theta(\xi) = \infty$.


□ 


Proof of Lemma 0 Put K = sup{iir(0 ■ f, >0}, which is finite by ( |^ . By ( |6.14[ ), there exists 
0 < ^ < oo such that 

0(O>^+7(^*,5*)- (A.9) 


Take s = 5(0- If s > s, 7(s,0 = by (6.11). Then, (6.15) follows from (3.2) and (6.10). 

It remains to prove ( |6.15 ) for s < s, which relies on the following inequalities below deduced 
from 

i 9o{y) < go{s) = go{s + f) fors<y<s + 0 
[go{y) > 9oi§) = go{s + 0 for y < s or ?/ > s + 0 


If s < s — 0 


I I rs+? 

go{y ^ s) dy = go{s) >-^ J 9o{y)dy. 


(A.IO) 


Then by (6.13) and (A.9), 


7(s, f)>kg+ + I go{y) dy = 0(0 + > 7(s*, ^O- 


If s — < s < s A (s + — 0) 


r§+^ 

5 o(yVs)dy=/ goiy)dy + {s - s)go{s) > go{y)dy, 
J s J s 


which implies that 7(5,0 > 7(^,0) (6.15) follows. Ifs + ^ — ^ < s < s, 

rs+f 

/ go{y V s) dy > (s - s)yo(s) + / 5o(y) dy + (s + ^ - s - 09o(s + 0 

J s J s 

rs+f 

= / 5o(y)dy + (^-05o(s), 


where the last equality follows from (6.6). Then, 


1 /■"+« 


1 1 

5 o(yVs)dy = -y go{y)dy-\ - —go{s)>^J go{y) 


1 


dy, 


where the last inequality follows from (A.IO). Since the above inequality is identical to (A.IO), we 
deduce that ( 6.15[ ) holds. □ 

Proof of Lemma\^ Clearly, Ym G L(m- Both from Zm{0—) = Z{0—) = x, these two inventory levels 
satisfy Zm{d) < Z{0) by ([j^-([j^. For t > 0, let 

to = sup{u € [0,t] : u € Im \ J, AYmiu) > 0}, 


with the convention sup0 = 0. Then, Zm{to) < Z{to) by (J4). If there is some u G {to,t] for which 
AYm{u) > 0, it is of type (J2) or (J3) and thus AYm{u) < AY{u). Because Y'^ is nondecreasing, it 
follows from (7.2) that Y^{t) — Y^{to) < Y^{t) — Y'^{to). Hence, Zm{t) < Z{t) for t >0. 


Consider some t > 0 for which Zm{t) <0. If x < m/2 and Zm{u) < m/2 for all u G [0,t], we 
have Y//{t) = Y^{t) by ( |7.2[ ) and AYmiu) = AY{u) for all u G [0,t] by (J2). Hence, Zm,{t) = Z{t). 
If there exists some ti G [0,t) such that Zm{ti) > m/2, let us consider the time 


t2 = sup{m G : Zm{u) > 0}. 

Since Zm does not have downward jumps, we have Zm{t2—) = 0 and AYm{t2) = 0, which yields 


Zm{t 2 ) = 0. Then, Z{t 2 ) > 0 because Zm{t 2 ) < Z{t 2 )- If Z{t 2 ) > 0, we can deduce from (J2)- 
0 that AYm{t 2 ) > 0, a contradiction. Hence, Zm{t 2 ) = Z{t 2 ) = 0. Because Zm{u) < 0 for all 
u G [t 2 ,t], we have — Y//{t 2 ) = Y^{t) — Y^{t 2 ) by ( |7.2[ ) and AYm{u) = AY (u) for all u G [t 2 ,t] 

by (J2). It follows that Zm{t) = Z{t) <Q holds. □ 


Proof of Lemma 9. Let us first prove the uniqueness and monotonicity of the solutions to (8.13)-(8.14).
By (6.5), $g_0$ has the minimum value at $z^*$. Put

\[
I(u) = \int_{g_0(z^*)}^{u} \Lambda(y)\, dy \quad \text{for } u \ge g_0(z^*).
\]

Then, $I(u)$ is a continuous function of $u$, with $I(g_0(z^*)) = 0$. Because $\Lambda(y)$ is nondecreasing in $y$
and $\Lambda(y) > 0$ for $y > g_0(z^*)$, $I(u)$ is strictly increasing in $u$ when $u \ge g_0(z^*)$ and $I(u) \to \infty$ as
$u \to \infty$. Hence, for each $\kappa \ge 0$, there is a unique $u \ge g_0(z^*)$ such that $I(u) = \kappa$. In addition, $u$ is
strictly increasing in $\kappa$. By (3.3) and (6.1), $g_0(z^*) = h(z^*)/\mu$, so the solution to (8.13) is unique
and $\hat\nu$ is strictly increasing in $\kappa$. Note that $g_0(z^*) \le -k + \hat\nu/\mu$. The uniqueness of the solution to
(8.14) also follows from (6.5). Moreover, $\hat s$ is strictly decreasing in $\hat\nu$ and $\hat S$ is strictly increasing in
$\hat\nu$. Then, their monotonicity in $\kappa$ follows from that of $\hat\nu$. The monotonicity of $\hat\xi$ in $\kappa$ follows from
the fact that $\hat\xi = \hat S - \hat s$.

Next, let us prove the optimality of $(\hat s, \hat S)$. When $\kappa = 0$, by (6.11) and (6.13),

\[
\theta(\xi) = \begin{cases} k\mu + \dfrac{\mu}{\xi} \displaystyle\int_{\bar s(\xi)}^{\bar S(\xi)} g_0(y)\, dy & \text{for } \xi > 0, \\[2ex] k\mu + \mu g_0(z^*) & \text{for } \xi = 0. \end{cases}
\]

By (6.5), $\theta(0) < \theta(\xi)$ for $\xi > 0$, so $\hat\xi = 0$ is the unique solution to (8.11). If $\kappa > 0$, we obtain
$\theta(0) = \infty$ by (6.11) since $\ell = \infty$. Hence, $\hat\xi = 0$ if and only if $\kappa = 0$. By (3.3) and (6.1),
$\hat\nu = k\mu + h(z^*)$ and $\hat s = \hat S = z^*$, and they satisfy (8.13) and (8.14), respectively.

When $\kappa > 0$, the uniqueness of $\hat\xi$ follows from Lemma 10. It remains to show $\hat\nu$ satisfies (8.13)
and $(\hat s, \hat S)$ satisfies (8.14). By (6.6), (6.11), (6.13), and the fact that $L(\hat\xi) = \kappa$, we obtain

\[
\theta(\hat\xi) = k\mu + \mu g_0(\bar s(\hat\xi)).
\]

Then, (8.14) follows from (6.6) and (8.12). By Tonelli's theorem and (6.5),

\[
L(\hat\xi) = \int_{g_0(z^*)}^{g_0(\bar s(\hat\xi))} \int_{\bar s(\hat\xi)}^{\bar S(\hat\xi)} 1_{(-\infty, y]}(g_0(u))\, du\, dy = \int_{g_0(z^*)}^{g_0(\bar s(\hat\xi))} \int_{-\infty}^{\infty} 1_{(-\infty, y]}(g_0(u))\, du\, dy = I(g_0(\bar s(\hat\xi))).
\]

Using (8.14) and the fact that $L(\hat\xi) = \kappa$, we obtain $I(-k + \hat\nu/\mu) = \kappa$, so $\hat\nu$ satisfies (8.13). □


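Proof of Lemma 10. By (6.5)-(6.7), $g_0(y) < g_0(\bar s(\xi)) = g_0(\bar S(\xi))$ for $\bar s(\xi) < y < \bar S(\xi)$. Then by (6.6)
and (6.9), $L$ is continuous and strictly increasing, with $L(0) = 0$. By (6.4) and (6.8), $L(\xi) \to \infty$ as
$\xi \to \infty$. It follows that for each $\kappa > 0$, there is a unique $\hat\xi > 0$ such that $L(\hat\xi) = \kappa$.

By (6.6), (6.11), and (6.13),

\[
\theta(\xi) = k\mu + \frac{\kappa\mu}{\xi} + \frac{\mu}{\xi} \int_{\bar s(\xi)}^{\bar S(\xi)} g_0(y)\, dy,
\]

the first derivative of which is

\[
\theta'(\xi) = -\frac{\kappa\mu}{\xi^2} - \frac{\mu}{\xi^2} \int_{\bar s(\xi)}^{\bar S(\xi)} g_0(y)\, dy + \frac{\mu}{\xi} \Big( g_0(\bar S(\xi)) \big(\bar s'(\xi) + 1\big) - g_0(\bar s(\xi))\, \bar s'(\xi) \Big).
\]

Using (6.6) again, we obtain

\[
\theta'(\xi) = -\frac{\kappa\mu}{\xi^2} - \frac{\mu}{\xi^2} \int_{\bar s(\xi)}^{\bar S(\xi)} g_0(y)\, dy + \frac{\mu}{\xi} g_0(\bar s(\xi)) = \frac{\mu \big(L(\xi) - \kappa\big)}{\xi^2}.
\]

Then, (8.15) follows from the fact that $L(\hat\xi) = \kappa$ and the monotonicity of $L$. □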


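Proof of Lemma 11. Suppose that there exists some $n \in \mathcal{N}_< \setminus \mathcal{N}$ such that $\chi(n)$ does not exist,
i.e., $K(\xi_j^*) \ne K_j$ for $j = 1, \ldots, n-1$. Since $K_1 > 0$, Lemma 9 implies that $\hat\xi_1 > 0 = Q_0$, so $1 \notin \mathcal{N}_<$
and $n \ge 2$. Because $\xi_n^* = Q_{n-1}$ and $K(\xi_n^*) \ne K_n$, $K(\xi_n^*) = K(Q_{n-1}) = K_{n-1}$. If $n - 1 \in \mathcal{N}_>$, we
should have $K(\xi_{n-1}^*) = K(Q_{n-1}) = K_{n-1}$, contradicting the hypothesis that $K(\xi_{n-1}^*) \ne K_{n-1}$. It
follows that $n - 1 \in \mathcal{N}_< \setminus \mathcal{N}$. By induction, we obtain $\{1, \ldots, n-1\} \subset \mathcal{N}_< \setminus \mathcal{N}$, which contradicts
the fact that $1 \notin \mathcal{N}_<$. Hence, $\chi(n)$ must exist.

For $n \in \mathcal{N}_< \setminus \mathcal{N}$, the above arguments also imply that $\{\chi(n) + 1, \ldots, n\} \subset \mathcal{N}_< \setminus \mathcal{N}$, which yields
$K_{\chi(n)} < \cdots < K_n$. By Lemma 9, $\hat\xi_{\chi(n)} < \hat\xi_{\chi(n)+1} \le Q_{\chi(n)}$, so $\chi(n) \in \mathcal{N}_= \cup \mathcal{N}_<$. It follows that

\[
\tilde\nu_n = \theta_n(Q_{n-1}) > \theta_{\chi(n)}(Q_{n-1}) \ge \theta_{\chi(n)}(\xi_{\chi(n)}^*) = \tilde\nu_{\chi(n)},
\]

where the first equality is due to the fact that $\xi_n^* = Q_{n-1}$, the first inequality is due to the fact
that $K_{\chi(n)} < K_n$, and the second inequality is due to (8.15) and the fact that

\[
\hat\xi_{\chi(n)} \le \xi_{\chi(n)}^* \le Q_{\chi(n)} \le Q_{n-1}.
\]

Since $\tilde\nu_j = \nu_j$ for $j \in \mathcal{N}$, we obtain $\tilde\nu_n > \nu_{\chi(n)}$.

Using the fact that $\hat\xi_N < Q_N = \infty$, we can follow similar arguments to prove that $\chi(n)$ exists
and that $\tilde\nu_{\chi(n)} < \tilde\nu_n$ for $n \in \mathcal{N}_> \setminus \mathcal{N}$. The details are thus omitted. □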